{"id":34,"date":"2023-04-24T15:16:24","date_gmt":"2023-04-24T19:16:24","guid":{"rendered":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/?page_id=34"},"modified":"2023-04-30T17:39:22","modified_gmt":"2023-04-30T21:39:22","slug":"how-to-clean-data","status":"publish","type":"page","link":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/measuring-impact\/how-to-clean-data\/","title":{"rendered":"How to Clean Data"},"content":{"rendered":"\r\n<h2 class=\"wp-block-heading\">Overview<\/h2>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li><strong>Time:<\/strong> 1.5 hours (5 pre-workshop, 20 teaching, 60 activity, 5 wrap-up)<\/li>\r\n\r\n\r\n\r\n<li><strong>Objectives<\/strong>\r\n<ul class=\"wp-block-list\">\r\n<li>Understand what makes tidy\/clean data<\/li>\r\n\r\n\r\n\r\n<li>Be able to name the importance of tidy\/clean data as it relates to your own organization\u2019s data<\/li>\r\n\r\n\r\n\r\n<li>Gain experience using OpenRefine as a tool to assist in cleaning data<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\">Pre-Workshop &amp; Required Materials<\/h2>\r\n\r\n\r\n\r\n<p>This workshop requires some additional resources.\u00a0 We will be working alongside one another to tidy a dataset and use OpenRefine, but first we have to ensure OpenRefine and our data is downloaded.\u00a0 This dataset is meant to emulate possible contact information for a nonprofit organization.\u00a0 See the directions below:<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li>Download the dataset from the link below:<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<div class=\"wp-block-file\" style=\"padding-left: 80px\"><a id=\"wp-block-file--media-848be106-d18b-4976-97b1-5af879830600\" href=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/MeasuringImpact_OpenRefineData-Sheet1.csv\">MeasuringImpact_OpenRefineData-Sheet1<\/a><a class=\"wp-block-file__button wp-element-button\" 
href=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/MeasuringImpact_OpenRefineData-Sheet1.csv\" aria-describedby=\"wp-block-file--media-848be106-d18b-4976-97b1-5af879830600\">Download<\/a><\/div>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li><a href=\"https:\/\/openrefine.org\/download.html\">Download the appropriate version of OpenRefine for your device (Windows\/Mac)<\/a><br \/>On a Windows computer, this application will download as a .zip file containing these files:<\/li>\r\n<\/ul>\r\n\r\n\r\n<div class=\"wp-block-image\">\r\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-42\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231024-1024x356.png\" alt=\"\" width=\"677\" height=\"235\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231024-1024x356.png 1024w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231024-300x104.png 300w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231024-768x267.png 768w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231024.png 1157w\" sizes=\"auto, (max-width: 677px) 100vw, 677px\" \/><\/figure>\r\n<\/div>\r\n\r\n\r\n<p style=\"padding-left: 40px\">When your screen looks like this, open the application named \u201copenrefine.\u201d\u00a0 At this point, you may be prompted to Extract All files in the folder, Run anyways, or cancel.\u00a0 Select the option to Extract All.\u00a0 This will prompt another popup window in which the system will prompt you to choose where you would like the unzipped files to go.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-45\" 
src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231649.png\" alt=\"\" width=\"414\" height=\"344\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231649.png 711w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-231649-300x249.png 300w\" sizes=\"auto, (max-width: 414px) 100vw, 414px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">I recommend leaving this as the default, as it will create another folder in your Downloads with all of the unzipped files, which looks a little something like this:<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-107\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-232042.png\" alt=\"\" width=\"577\" height=\"77\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-232042.png 760w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-232042-300x40.png 300w\" sizes=\"auto, (max-width: 577px) 100vw, 577px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Open the unzipped folder, which is labeled \u201cFile folder\u201d as opposed to being Compressed.\u00a0 You should see the following files.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-111\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-233850-1.png\" alt=\"\" width=\"423\" height=\"273\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-233850-1.png 670w, 
https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-233850-1-300x193.png 300w\" sizes=\"auto, (max-width: 423px) 100vw, 423px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Again, open the application file entitled \u201copenrefine.\u201d\u00a0 You will likely receive an error message such as the one below, given that OpenRefine is not a Microsoft application.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-112\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-234119.png\" alt=\"\" width=\"356\" height=\"331\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-234119.png 658w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-234119-300x279.png 300w\" sizes=\"auto, (max-width: 356px) 100vw, 356px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">In this window, click \u201cMore Info.\u201d\u00a0 Another button will appear at the bottom.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-113\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-234904.png\" alt=\"\" width=\"361\" height=\"336\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-234904.png 662w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-234904-300x279.png 300w\" sizes=\"auto, (max-width: 361px) 100vw, 361px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">In this window, click \u201cRun anyway\u201d to open the screen 
below.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-114\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-235027-1024x553.png\" alt=\"\" width=\"760\" height=\"410\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-235027-1024x553.png 1024w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-235027-300x162.png 300w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-235027-768x415.png 768w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-03-29-235027.png 1526w\" sizes=\"auto, (max-width: 760px) 100vw, 760px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p>Congratulations! You have successfully opened OpenRefine and are ready for the workshop.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<h2 class=\"wp-block-heading\">Introduction to Tidy Data Concepts &amp; Materials<\/h2>\r\n<p>\r\n\r\n<\/p>\r\n<p>In this section we will discuss tidy data and the principles that undergird &#8220;useful&#8221; or &#8220;well-formatted&#8221; data, as well as the software that we will be using.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<h3 class=\"wp-block-heading\"><em>What is tidy\/clean data?<\/em><\/h3>\r\n<p>\r\n\r\n<\/p>\r\n<p>Hadley Wickham, a statistician from New Zealand, wrote an <a href=\"https:\/\/www.jstatsoft.org\/article\/view\/v059i10\">article in 2014 in the <em>Journal of Statistical Software<\/em><\/a> that is now considered the basis for defining clean\/tidy data.\u00a0 He notes the following:<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\r\n<p>Tidy datasets are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, 
each observation is a row, and each type of observational unit is a table. This framework makes it easy to tidy messy datasets because only a small set of tools are needed to deal with a wide range of un-tidy datasets. This structure also makes it easier to develop tidy tools for data analysis, tools that both input and output tidy datasets. The advantages of a consistent data structure and matching tools are demonstrated with a case study free from mundane data manipulation chores.<\/p>\r\n<cite>(Hadley Wickham, Tidy Data, Vol. 59, Issue 10, Sep 2014, Journal of Statistical Software. <a href=\"http:\/\/www.jstatsoft.org\/v59\/i10\">http:\/\/www.jstatsoft.org\/v59\/i10<\/a>.)<\/cite><\/blockquote>\r\n<p>\r\n\r\n<\/p>\r\n<p>As people who work with data day-to-day, what does this mean for us?\u00a0 Well, we often use spreadsheets to provide structure to data.\u00a0 \u201cTidy\u201d data is data arranged in a spreadsheet according to a set of principles that make it as easy as possible to work with.\u00a0 So, what are these principles?\u00a0 Dr. 
Katie Walden, Professor of American Studies at Notre Dame, summarizes these principles and provides more specifics in <a href=\"https:\/\/github.com\/kwaldenphd\/tidy-data-principles\">her lab on OpenRefine:<\/a><\/p>\r\n<p>\r\n\r\n<\/p>\r\n<ol class=\"wp-block-list\">\r\n<li>Be consistent\r\n<ul class=\"wp-block-list\">\r\n<li>Use consistent codes for categorical variables<\/li>\r\n\r\n\r\n\r\n<li>Use a consistent fixed code for any missing values<\/li>\r\n\r\n\r\n\r\n<li>Use consistent variable names<\/li>\r\n\r\n\r\n\r\n<li>Use consistent subject identifiers<\/li>\r\n\r\n\r\n\r\n<li>Use a consistent data layout in multiple files<\/li>\r\n\r\n\r\n\r\n<li>Use consistent file names<\/li>\r\n\r\n\r\n\r\n<li>Use a consistent format for all dates<\/li>\r\n\r\n\r\n\r\n<li>Use consistent phrases in your notes<\/li>\r\n\r\n\r\n\r\n<li>Be careful about extra spaces within cells<\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li>Choose good names\r\n<ul class=\"wp-block-list\">\r\n<li>Avoid spaces<\/li>\r\n\r\n\r\n\r\n<li>Avoid special characters<\/li>\r\n\r\n\r\n\r\n<li>Be short but meaningful<\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li>Write dates as YYYY-MM-DD\r\n<ul class=\"wp-block-list\">\r\n<li>Or have separate columns for YEAR, MONTH, DATE<\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li>No empty cells<\/li>\r\n\r\n\r\n\r\n<li>Put just one thing in a cell<\/li>\r\n\r\n\r\n\r\n<li>Make it a rectangle\r\n<ul class=\"wp-block-list\">\r\n<li>Single first row with variable names<\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li>Create a data dictionary\r\n<ul class=\"wp-block-list\">\r\n<li>You might also find this information in a codebook that goes with a dataset<\/li>\r\n\r\n\r\n\r\n<li>Things to include:\r\n<ul class=\"wp-block-list\">\r\n<li>The exact variable name as in the data file<\/li>\r\n\r\n\r\n\r\n<li>A version of the variable name that might be used in data visualizations<\/li>\r\n\r\n\r\n\r\n<li>A longer explanation of what the variable 
means<\/li>\r\n\r\n\r\n\r\n<li>The measurement units<\/li>\r\n\r\n\r\n\r\n<li>Expected minimum and maximum values<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li>No calculations in the raw data files<\/li>\r\n\r\n\r\n\r\n<li>Do not use font color\/highlight as data<\/li>\r\n\r\n\r\n\r\n<li>Make backups\r\n<ul class=\"wp-block-list\">\r\n<li>Multiple locations (OneDrive, local computer, etc.)<\/li>\r\n\r\n\r\n\r\n<li>Version control program (e.g., Git)<\/li>\r\n\r\n\r\n\r\n<li>Write-protect the file when not entering data<\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li>Use data validation to avoid errors<\/li>\r\n\r\n\r\n\r\n<li>Save a copy of the data in plain text files\r\n<ul class=\"wp-block-list\">\r\n<li>File formats can include comma-separated values (CSV) or plain-text (TXT)<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<p>\r\n\r\n<\/p>\r\n<p>Additionally, the <a href=\"https:\/\/librarycarpentry.org\/lc-spreadsheets\/02-common-mistakes\/index.html\">Carpentries lab<\/a> notes some common errors in spreadsheets:<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<ul class=\"wp-block-list\">\r\n<li>Multiple tables<\/li>\r\n\r\n\r\n\r\n<li>Multiple tabs<\/li>\r\n\r\n\r\n\r\n<li>Not filling in zeros<\/li>\r\n\r\n\r\n\r\n<li>Using bad null values<\/li>\r\n\r\n\r\n\r\n<li>Using formatting to convey information<\/li>\r\n\r\n\r\n\r\n<li>Using formatting to make the data sheet look pretty<\/li>\r\n\r\n\r\n\r\n<li>Placing comments or units in cells<\/li>\r\n\r\n\r\n\r\n<li>More than one piece of information in a cell<\/li>\r\n\r\n\r\n\r\n<li>Field name problems<\/li>\r\n\r\n\r\n\r\n<li>Special characters in data<\/li>\r\n\r\n\r\n\r\n<li>Inclusion of metadata in data table<\/li>\r\n\r\n\r\n\r\n<li>Date formatting<\/li>\r\n<\/ul>\r\n<p>\r\n\r\n<\/p>\r\n<p>At this point, you may still be wondering what tidy data looks like.\u00a0 This will likely become much clearer in the lab component of this workshop.\u00a0 But first, let\u2019s learn a bit about the software we will be 
using.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<h3 class=\"wp-block-heading\"><em>What is OpenRefine?<\/em><\/h3>\r\n<p>\r\n\r\n<\/p>\r\n<p>According to its <a href=\"https:\/\/openrefine.org\/\">website<\/a>, OpenRefine is a \u201cpowerful free, open source tool for working with messy data: cleaning it, transforming it from one format into another; and extending it with web services and external data.\u201d\u00a0 The <a href=\"https:\/\/librarycarpentry.org\/lc-open-refine\/01-introduction\/index.html\">Carpentries introduction to OpenRefine<\/a> explains that the tool is \u201cmost useful where you have data in a simple tabular format such as a spreadsheet, a comma separated values file (csv) or a tab delimited file (tsv) but with internal inconsistencies either in data formats, or where data appears, or in terminology used\u201d as it can be used to more easily standardize your data and give quick insights.\u00a0 Enough discussion, though; let\u2019s get started!<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<h2 class=\"wp-block-heading\">Lab Activity<\/h2>\r\n<p>\r\n\r\n<\/p>\r\n<p>In pairs or small groups, follow the sequence of the lab below.\u00a0 Instructors will be available for assistance.\u00a0 Or, if the group prefers, we can conduct a live data cleaning session together.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">1. 
Ensure that you have a CSV of the following dataset somewhere on your computer.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<div class=\"wp-block-file\" style=\"padding-left: 80px\"><a id=\"wp-block-file--media-55dcd545-035d-4a98-a82e-2bccb8cefe45\" href=\"http:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/MeasuringImpact_OpenRefineData-Sheet1.csv\">MeasuringImpact_OpenRefineData-Sheet1<\/a><a class=\"wp-block-file__button wp-element-button\" href=\"http:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/MeasuringImpact_OpenRefineData-Sheet1.csv\" aria-describedby=\"wp-block-file--media-55dcd545-035d-4a98-a82e-2bccb8cefe45\">Download<\/a><\/div>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">2. Open OpenRefine on your computer.\u00a0 See the instructions above in the Pre-Workshop and Required Materials section.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">3. Once you have OpenRefine in front of you, click the button \u201cChoose Files\u201d.\u00a0 A popup window will appear.\u00a0 Navigate to wherever the Measuring Impact dataset is located on your computer; click the file and then click \u201cOpen\u201d.\u00a0 Once this is complete, you should see the file name next to the \u201cChoose Files\u201d button, like this:<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-117\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-214808.png\" alt=\"\" width=\"418\" height=\"141\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-214808.png 520w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-214808-300x102.png 300w\" sizes=\"auto, (max-width: 418px) 100vw, 418px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">4. 
Click \u201cNext\u201d.\u00a0 Once uploaded, the data will open on a new screen with several options.\u00a0 Ensure that the option \u201cParse next 1 line(s) as column headers\u201d is selected.\u00a0 Check also that the \u201cParse cell text into numbers, dates,&#8230;\u201d option is <em>not<\/em> selected.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">5. On the top right corner of the screen, there will be a box with the label \u201cProject Name\u201d.\u00a0 Go ahead and give this workshop a name.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">6. Click the top rightmost button on the screen, called \u201cCreate Project &gt;&gt;\u201d.\u00a0 This may take a moment.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">7. Once the data loads, the program will likely default to only display the first 5 records &#8211; don\u2019t worry, they didn\u2019t get deleted!\u00a0 Above your data, click to display 1000 rows so we can see more of our data.\u00a0 We have a total of 600 records, which is far too many to go through manually!\u00a0 Your screen should look something like the one below:<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-121\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-220359-1024x511.png\" alt=\"\" width=\"816\" height=\"407\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-220359-1024x511.png 1024w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-220359-300x150.png 300w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-220359-768x383.png 768w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-220359.png 1134w\" sizes=\"auto, (max-width: 
816px) 100vw, 816px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Note that the panel on the left displays information about facets and filters.\u00a0 This includes additional resources, such as screencasts, that can supplement the information provided in this workshop.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">8. Now that the data is open, take a moment to scroll through.\u00a0 With your partner or small group, discuss where you see inconsistencies in this dataset and what else may not be clean about it, given the information discussed in the previous segment of the workshop.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">9. In OpenRefine, data is listed in a tabular format.\u00a0 OpenRefine also offers \u201cfacets\u201d, which help you quickly gather information about the data and ensure it is clean and consistent.\u00a0 According to <a href=\"https:\/\/librarycarpentry.org\/lc-open-refine\/01-introduction\/index.html\">Library Carpentry\u2019s Introduction to OpenRefine<\/a>, \u201ca facet groups all the values that appear in a column, and then allows you to filter the data by these values and edit values across many records at the same time\u201d.\u00a0 Let\u2019s practice by making a text facet.\u00a0 Click the dropdown arrow next to the \u201cstate\u201d column.\u00a0 Navigate to Facet &gt; Text facet.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-122\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-232332.png\" alt=\"\" width=\"231\" height=\"180\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-232332.png 311w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-232332-300x233.png 300w\" 
sizes=\"auto, (max-width: 231px) 100vw, 231px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">10. As soon as you click this, a window will appear to the left side of your screen that shows the facet, essentially demonstrating all the unique values in that column.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-123\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-232614.png\" alt=\"\" width=\"178\" height=\"173\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">11. Notice when you hover over these values, two options appear: Include and Edit.\u00a0 Include can be used to select multiple values, while Edit can be used to make edits to all values and ensure the data is consistent.\u00a0 As you have likely already noticed, we do not have consistent data; there are capitalization issues, and some people listed the entire state name as opposed to just the two-letter abbreviation.\u00a0 Here, we just want to have the two-letter abbreviations in capital letters.\u00a0 Let\u2019s start with the values \u201cin\u201d.\u00a0 The grey number next to this entry indicates that there are 23 occurrences throughout our dataset.\u00a0 Hover over this value and click \u201cEdit\u201d.\u00a0 A box will appear for you to change this value.\u00a0 Change it to \u201cIN\u201d and click \u201cApply\u201d.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-124\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-233914.png\" alt=\"\" width=\"597\" height=\"168\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-233914.png 707w, 
https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-233914-300x84.png 300w\" sizes=\"auto, (max-width: 597px) 100vw, 597px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">12. After doing this, we get a message at the top of the screen, noting that we made a mass edit:<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"289\" height=\"27\" class=\"wp-image-125\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-05-233955.png\" alt=\"\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">And, note that the \u201cstate\u201d facet no longer includes the value \u201cin\u201d, meaning we successfully changed all these values to be consistent with \u201cIN\u201d.\u00a0 Go ahead and edit the rest of these values so they are consistent with IN or MI.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">13. Now that you are familiar with how to use text facets, repeat the process with the age and council_district columns, ensuring that only numerical values remain.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">14. 
Still, we have some issues that require correction: inconsistent dates in the first_contacted column, both first and last names in the first_name column, and inconsistently formatted parent_phone numbers.\u00a0 These issues are a bit more complex to fix.\u00a0 Of course, you can always repeat the same process and manually change each value, but for something like a date, this is not always the easiest way to clean your data.\u00a0 Let\u2019s start with the first_contacted column.\u00a0 Click the dropdown and go to Edit cells &gt; Common Transforms and select \u201cTo Date\u201d.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"125\" height=\"113\" class=\"wp-image-126\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-002414.png\" alt=\"\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">The values in the column are immediately converted to ISO 8601 date format!<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">15. 
Unfortunately, our dataset has no times associated with these dates.\u00a0 We could leave this the way it is, but if we just wanted to keep the date itself, we could make a few edits.\u00a0 First, we need to convert this column back into text so we can trim each value down to the part that we want to keep.\u00a0 Thankfully, OpenRefine will still keep all of our corrected dates so we don\u2019t have miscellaneous text!\u00a0 Click the dropdown button next to first_contacted again and go to Edit Cells &gt; Common Transforms &gt; To Text.\u00a0 This will turn the color of the dates black again.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"124\" height=\"130\" class=\"wp-image-127\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003646.png\" alt=\"\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">We must do this because we can only take subsections of text, not dates.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">16. 
Next, we want to select the arrow by first_contacted again, and this time navigate to Edit Cells &gt; Transform.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-128\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003816.png\" alt=\"\" width=\"243\" height=\"245\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003816.png 311w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003816-298x300.png 298w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003816-150x150.png 150w\" sizes=\"auto, (max-width: 243px) 100vw, 243px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">17. Once we do this, a popup window will appear.\u00a0 In this window, select \u201cPython \/ Jython\u201d in the Language dropdown, and enter the following code:<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<pre class=\"wp-block-code\" style=\"padding-left: 80px\"><code>value = value[0:10]\r\nreturn value<\/code><\/pre>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-129\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003926.png\" alt=\"\" width=\"386\" height=\"336\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003926.png 608w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-003926-300x262.png 300w\" sizes=\"auto, (max-width: 386px) 100vw, 386px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Because all of the dates are now the same length since being 
put into ISO 8601 format, we can just tell OpenRefine that we want to take the first 10 characters and get rid of the timestamp.\u00a0 From here, press OK.\u00a0 Our values are now just plain dates!<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"109\" height=\"111\" class=\"wp-image-130\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-004103.png\" alt=\"\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Again, we could make these changes manually, but this approach saves a great deal of time.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">18. Next, let\u2019s tackle the parent_phone column.\u00a0 People typed in their phone numbers in several different ways, using parentheses, hyphens, and periods.\u00a0 For our purposes, we only want the number without any special characters.\u00a0 Let\u2019s start by opening the dropdown by parent_phone and navigating to Edit Cells &gt; Transform again.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">19. Ensure again that the Language is set to Python \/ Jython.\u00a0 Then, input the following code:<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<pre class=\"wp-block-code\" style=\"padding-left: 80px\"><code>value = value.replace(\"(\",\"\").replace(\")\",\"\").replace(\"-\", \"\").replace(\".\", \"\")\r\nreturn value<\/code><\/pre>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">While this may look like a lot of nonsense characters, each replace call tells OpenRefine to find one of the characters ( ) &#8211; and . and replace it with nothing.\u00a0 In other words, it simply deletes those characters and returns the new value, which is the phone number with only numerical information.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">20. 
Our last issue is that there are several individuals whose last name is in the first_name category along with their first name, instead of being consistently two separate columns.\u00a0 To address this, let\u2019s click the dropdown under first_name and go to Edit column &gt; Split into several columns.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-131\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010040.png\" alt=\"\" width=\"291\" height=\"296\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010040.png 388w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010040-295x300.png 295w\" sizes=\"auto, (max-width: 291px) 100vw, 291px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">21. A popup will appear.\u00a0 Ensure that \u201cby separator\u201d is selected, and in the Separator box, enter one single space.\u00a0 Ensure that the box \u201cRemove this column\u201d is not checked so we can ensure we have the right data after doing this. Press OK.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-132\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010305.png\" alt=\"\" width=\"427\" height=\"207\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010305.png 602w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010305-300x146.png 300w\" sizes=\"auto, (max-width: 427px) 100vw, 427px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">22. 
After doing this, what do you notice?\u00a0 Our first names are neatly separated, but the last names are now split between last_name and the new column, first_name 2!\u00a0 We need to merge these two columns so that they appear as one cohesive column.\u00a0 Click the dropdown for first_name 2 and select Edit column &gt; Join columns.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">23. A popup window will appear again.\u00a0 On the left side, ensure that first_name 2 and last_name are selected.\u00a0 Leave everything else at its default and press OK.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-133\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010840.png\" alt=\"\" width=\"550\" height=\"316\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010840.png 799w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010840-300x172.png 300w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010840-768x441.png 768w\" sizes=\"auto, (max-width: 550px) 100vw, 550px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">24. 
Now we\u2019ve got our data separated neatly, but there are many repetitive columns!<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"387\" height=\"217\" class=\"wp-image-134\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010923.png\" alt=\"\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010923.png 387w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-010923-300x168.png 300w\" sizes=\"auto, (max-width: 387px) 100vw, 387px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Let\u2019s fix this by deleting the columns we don\u2019t need and renaming the correct ones.<br \/>Click the dropdown for first_name, our original column with errors, and select Edit column &gt; Remove this column.\u00a0 Then, we will do the same for last_name, which was originally missing some data.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">25. 
After we do this, we have correct and clean data, so we can rename our columns as revised_first_name and revised_last_name by clicking the dropdown arrow next to the name of the column and navigating to Edit column &gt; Rename this column.\u00a0 Our data is now clean and ready to go!<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-135\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011537.png\" alt=\"\" width=\"672\" height=\"261\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011537.png 888w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011537-300x117.png 300w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011537-768x298.png 768w\" sizes=\"auto, (max-width: 672px) 100vw, 672px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">\u2026.Or is it?<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">Scroll down towards the bottom of the dataset.\u00a0 Can you spot the error that still exists?<br \/><br \/>26. 
In row 589, Juliann Rinehart\u2019s last name is duplicated, because it was typed into both the original first_name and last_name columns.\u00a0 To fix this, we can hover over the cell, click the Edit button, and adjust the name to Rinehart.<\/p>\r\n<p>\r\n\r\n<div class=\"wp-block-image\"><\/p>\r\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-136\" src=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011809.png\" alt=\"\" width=\"674\" height=\"113\" srcset=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011809.png 825w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011809-300x50.png 300w, https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/Screenshot-2023-04-06-011809-768x128.png 768w\" sizes=\"auto, (max-width: 674px) 100vw, 674px\" \/><\/figure>\r\n<p><\/div>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">This just goes to show that it is worth checking your data for additional errors, even if you think you caught everything.\u00a0 It also shows that cleaning data involves decision-making: forms and data are not always clean, and we often must decide how to represent what we believe to be correct based on the context.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p style=\"padding-left: 40px\">27. Now, we are ready to export our clean data!\u00a0 Head over to the Export button on the upper right side of the screen and select your desired format, likely either Excel or comma-separated values.\u00a0 The exported file will save to your Downloads folder, ready to share or to use for building visualizations! 
Below is a link to the cleaned data if you would like to compare it with your own.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<div class=\"wp-block-file\" style=\"padding-left: 80px\"><a id=\"wp-block-file--media-87a35f7b-cc95-43f6-ab80-5462c4bd943e\" href=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/OpenRefine_Workshop_Example.xls\">OpenRefine_Workshop_Example<\/a><a class=\"wp-block-file__button wp-element-button\" href=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/files\/2023\/04\/OpenRefine_Workshop_Example.xls\" aria-describedby=\"wp-block-file--media-87a35f7b-cc95-43f6-ab80-5462c4bd943e\">Download<\/a><\/div>\r\n<p>\r\n\r\n<\/p>\r\n<h2 class=\"wp-block-heading\">Closing Discussion<\/h2>\r\n<p>\r\n\r\n<\/p>\r\n<p>OpenRefine may be very new to you, and learning a new skill or piece of software takes time.\u00a0 Still, compare this process to the time it would likely have taken to comb through the data and correct every mistake by hand!\u00a0 While we covered some techniques that require a little bit of background in Python, this is meant to serve as a gateway to explore OpenRefine.\u00a0 In the first few examples of the lab, we saw that not everything in OpenRefine requires a background in coding.\u00a0 As we come back together, we invite you to share any reflections with the larger group:<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<ul class=\"wp-block-list\">\r\n<li>How does OpenRefine compare to your past methods of cleaning data?<\/li>\r\n\r\n\r\n\r\n<li>Are there any particular aspects of OpenRefine that would be especially useful to you in your role?<\/li>\r\n\r\n\r\n\r\n<li>Based on the previous workshop in this series, is there a way that we might avoid having to do all this data cleaning in the first place?<\/li>\r\n<\/ul>\r\n<p>\r\n\r\n<\/p>\r\n<p>We know that learning new software is often frustrating and gradual, but you\u2019ve reached the end of this workshop!\u00a0 We\u2019d like to thank you for your 
patience and participation with us in this journey!<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<p>Congratulations on completing this workshop! Please consider giving us feedback in our survey linked in the button below so that we can continue to improve our workshops.<\/p>\r\n<p>\r\n\r\n<\/p>\r\n<div class=\"wp-block-buttons is-content-justification-right is-layout-flex wp-container-core-buttons-is-layout-765c4724 wp-block-buttons-is-layout-flex\">\r\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link has-white-color has-text-color has-background wp-element-button\" style=\"background-color: #0c2340\" href=\"https:\/\/docs.google.com\/forms\/d\/e\/1FAIpQLSdue4I_igcGBpCKuEx-KIuNT27mpoP0OqaoWS8WwQQWzxa0Bg\/viewform?usp=sf_link\" target=\"_blank\" rel=\"noreferrer noopener\">Feedback Survey<\/a><\/div>\r\n\r\n\r\n\r\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link has-white-color has-text-color has-background wp-element-button\" style=\"background-color: #0c2340\" href=\"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/measuring-impact\/how-to-analyze-visually-represent-data\/\">How to Analyze &amp; Visually Represent Data \u2192<\/a><\/div>\r\n<\/div>\r\n<p><\/p>","protected":false},"excerpt":{"rendered":"<p>Overview Pre-Workshop &amp; Required Materials This workshop requires some additional resources.\u00a0 We will be working alongside one another to tidy a dataset and use OpenRefine, but first we have to ensure OpenRefine and our data is downloaded.\u00a0 This dataset is meant to emulate possible contact information for a nonprofit organization.\u00a0 See the directions below: When 
[&hellip;]<\/p>\n","protected":false},"author":4527,"featured_media":0,"parent":24,"menu_order":1,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-34","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/pages\/34","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/users\/4527"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/comments?post=34"}],"version-history":[{"count":20,"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/pages\/34\/revisions"}],"predecessor-version":[{"id":150,"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/pages\/34\/revisions\/150"}],"up":[{"embeddable":true,"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/pages\/24"}],"wp:attachment":[{"href":"https:\/\/sites.nd.edu\/expanding-nonprofit-data-analysis\/wp-json\/wp\/v2\/media?parent=34"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}