{"id":553,"date":"2021-08-12T13:27:02","date_gmt":"2021-08-12T13:27:02","guid":{"rendered":"https:\/\/sites.rutgers.edu\/joann-ordille\/?p=553"},"modified":"2021-08-12T13:32:42","modified_gmt":"2021-08-12T13:32:42","slug":"the-data-in-data-science","status":"publish","type":"post","link":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/","title":{"rendered":"The Data in Data Science"},"content":{"rendered":"<p>Data science starts with the data, and there are several steps to preparing data for exploration, hypothesis formation, analysis and visualization.\u00a0 First, data must be acquired.\u00a0 It may be available in your company\u2019s databases, or you may need to find it in public or private data sources.\u00a0 Once obtained, the data is managed in a system, sometimes as simple as a spreadsheet but often a database system.\u00a0 It is evaluated for correctness, completeness and usability.\u00a0 Perhaps the source is authoritative and known to be correct.\u00a0 On the other hand, multiple data sources might be compared for consistency as a measure of likely correctness.\u00a0 Is data missing from the source?\u00a0 If it\u2019s about universities, are all the universities represented?\u00a0 The data acquired might not be usable yet because it\u2019s in the wrong format or inconsistent.\u00a0 For example, the measurements in the data may be in different units than those needed, or the names used to describe people or objects may differ.\u00a0 Height might be in feet, in inches, or in feet and inches.\u00a0 Rutgers might be Rutgers University or simply Rutgers.\u00a0 Data is placed in the needed format and made consistent in a process called cleaning or wrangling.<\/p>\n<p>Once cleaned, your data can be analyzed.\u00a0 You might want to explore it a bit with queries or data visualizations to get a feel for what it says.\u00a0 Simple summarization or advanced hypothesis formation and testing can be done to gain insight into the 
data.<\/p>\n<p>In the next posts, we will discuss obtaining and readying data for further exploration, analysis and visualization.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Data science starts with the data, and there are several steps to preparing data for exploration, hypothesis formation, analysis and visualization.\u00a0 First, data must be acquired.\u00a0 It may be available &hellip; <a href=\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/\" class=\"\">Read More<\/a><\/p>\n","protected":false},"author":1815,"featured_media":555,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[9,10],"tags":[],"class_list":["post-553","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","category-databases"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>The Data in Data Science - Joann J. Ordille<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The Data in Data Science - Joann J. Ordille\" \/>\n<meta property=\"og:description\" content=\"Data science starts with the data, and there are several steps to preparing data for exploration, hypothesis formation, analysis and visualization.\u00a0 First, data must be acquired.\u00a0 It may be available &hellip; Read More\" \/>\n<meta property=\"og:url\" content=\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/\" \/>\n<meta property=\"og:site_name\" content=\"Joann J. 
Ordille\" \/>\n<meta property=\"article:published_time\" content=\"2021-08-12T13:27:02+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-08-12T13:32:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1558\" \/>\n\t<meta property=\"og:image:height\" content=\"1173\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Joann Ordille\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Joann Ordille\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/\",\"url\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/\",\"name\":\"The Data in Data Science - Joann J. 
Ordille\",\"isPartOf\":{\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png\",\"datePublished\":\"2021-08-12T13:27:02+00:00\",\"dateModified\":\"2021-08-12T13:32:42+00:00\",\"author\":{\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/#\/schema\/person\/d086d5745ec793e70d0d011dbedf370a\"},\"breadcrumb\":{\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#primaryimage\",\"url\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png\",\"contentUrl\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png\",\"width\":1558,\"height\":1173,\"caption\":\"The Data Science Pipeline\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"The Data in Data Science\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/#website\",\"url\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/\",\"name\":\"Joann J. 
Ordille\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/#\/schema\/person\/d086d5745ec793e70d0d011dbedf370a\",\"name\":\"Joann Ordille\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/ffe56d4671b3db40ac4b317a390c83b5?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/ffe56d4671b3db40ac4b317a390c83b5?s=96&d=mm&r=g\",\"caption\":\"Joann Ordille\"},\"url\":\"https:\/\/sites.rutgers.edu\/joann-ordille\/author\/jo531\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"The Data in Data Science - Joann J. Ordille","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/","og_locale":"en_US","og_type":"article","og_title":"The Data in Data Science - Joann J. Ordille","og_description":"Data science starts with the data, and there are several steps to preparing data for exploration, hypothesis formation, analysis and visualization.\u00a0 First, data must be acquired.\u00a0 It may be available &hellip; Read More","og_url":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/","og_site_name":"Joann J. 
Ordille","article_published_time":"2021-08-12T13:27:02+00:00","article_modified_time":"2021-08-12T13:32:42+00:00","og_image":[{"width":1558,"height":1173,"url":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png","type":"image\/png"}],"author":"Joann Ordille","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Joann Ordille","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/","url":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/","name":"The Data in Data Science - Joann J. Ordille","isPartOf":{"@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/#website"},"primaryImageOfPage":{"@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#primaryimage"},"image":{"@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#primaryimage"},"thumbnailUrl":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png","datePublished":"2021-08-12T13:27:02+00:00","dateModified":"2021-08-12T13:32:42+00:00","author":{"@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/#\/schema\/person\/d086d5745ec793e70d0d011dbedf370a"},"breadcrumb":{"@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#primaryimage","url":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png","contentUrl":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-content\/uploads\/sites\/706\/2021\/08\/DataSciencePipeline.png","width":1558,"height":1173
,"caption":"The Data Science Pipeline"},{"@type":"BreadcrumbList","@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/the-data-in-data-science\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/sites.rutgers.edu\/joann-ordille\/"},{"@type":"ListItem","position":2,"name":"The Data in Data Science"}]},{"@type":"WebSite","@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/#website","url":"https:\/\/sites.rutgers.edu\/joann-ordille\/","name":"Joann J. Ordille","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/sites.rutgers.edu\/joann-ordille\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/#\/schema\/person\/d086d5745ec793e70d0d011dbedf370a","name":"Joann Ordille","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/sites.rutgers.edu\/joann-ordille\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/ffe56d4671b3db40ac4b317a390c83b5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/ffe56d4671b3db40ac4b317a390c83b5?s=96&d=mm&r=g","caption":"Joann 
Ordille"},"url":"https:\/\/sites.rutgers.edu\/joann-ordille\/author\/jo531\/"}]}},"_links":{"self":[{"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/posts\/553"}],"collection":[{"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/users\/1815"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/comments?post=553"}],"version-history":[{"count":1,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/posts\/553\/revisions"}],"predecessor-version":[{"id":554,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/posts\/553\/revisions\/554"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/media\/555"}],"wp:attachment":[{"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/media?parent=553"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/categories?post=553"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.rutgers.edu\/joann-ordille\/wp-json\/wp\/v2\/tags?post=553"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}