EMSI FAQ: Where Does EMSI Data Come From?

October 15, 2014 By Joshua Wright

Introduction

EMSI provides a composite dataset that integrates over 90 federal and state labor market data sources into one robust database. This is the foundational dataset that drives our suite of products and services. EMSI data provides insight into regional economies through its look at industries, occupations, postsecondary training programs, demographics, wages, and more. We provide this data at the state, county, and metro area levels, with ZIP code estimates available for core data (employment, earnings, demographics). We update all of this data four times per year.

In this article, we’ll provide a brief explanation of the sources we use, how we deal with suppressions, and how we link disparate datasets. To dig deeper on EMSI data and where it comes from, we encourage you to participate in the EMSI Certification Program.

Sources

EMSI starts by downloading data from government sources like the Bureau of Economic Analysis (BEA), the U.S. Census Bureau, the Bureau of Labor Statistics, and others. From these sources come particular datasets (see our complete list).

When we first receive these datasets, they are large, they cover different geographies, they may show ranges rather than specific numbers, and they may have varying levels of detail. The biggest hurdle we run into is suppressions.

Suppressions

"Suppressions" is the term EMSI uses for data points that are not disclosed in the datasets we receive. The government organizations that publish the data create suppressions in order to comply with laws and regulations that protect the privacy of the businesses that report to them. These organizations publish the datasets primarily for statistical purposes.

Think of it like a Sudoku puzzle. There are some numbers showing, but there are also a lot of empty cells. It is at this point that EMSI's sophisticated algorithms make it possible for us to replace these suppressions with mathematically educated estimates. We use numbers that we know from one set to inform our estimate for another set. We do this many times for each geographic area, at every level from the nation down to the ZIP code, and we update the estimates every quarter. These algorithms have taken EMSI years to develop.
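To make the Sudoku analogy concrete, here is a minimal sketch of one way a suppressed cell could be estimated. It is not EMSI's actual algorithm, which the article does not describe. It assumes a published parent total (say, a state), a set of child values (counties) with some suppressed, and a second dataset that supplies reference weights for the suppressed cells. The function name fill_suppressed and all figures are hypothetical.

```python
def fill_suppressed(parent_total, children, reference):
    """Estimate suppressed child values so that all children sum to the parent.

    parent_total: published total for the parent geography.
    children:     dict of child -> value, with None marking a suppressed cell.
    reference:    dict of child -> weight from another dataset (an assumption
                  of this sketch), used to split the unexplained residual.
    """
    known = sum(v for v in children.values() if v is not None)
    residual = parent_total - known  # jobs hidden behind suppressions
    missing = [k for k, v in children.items() if v is None]
    weight_sum = sum(reference[k] for k in missing)

    filled = dict(children)
    for k in missing:
        if weight_sum:
            # Split the residual in proportion to the reference weights.
            filled[k] = residual * reference[k] / weight_sum
        else:
            # No usable reference data: fall back to an even split.
            filled[k] = residual / len(missing)
    return filled


# Hypothetical example: a state total with two suppressed counties.
state_total = 1000
counties = {"A": 400, "B": None, "C": None, "D": 300}
other_source = {"B": 60, "C": 40}  # weights taken from a second dataset
estimates = fill_suppressed(state_total, counties, other_source)
```

In this sketch the known cells are left untouched and the estimates are forced to add up to the published total, which is the same kind of constraint that makes a Sudoku solvable. A real pipeline would repeat this across many overlapping totals (industry by geography, geography by level) rather than a single one.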

Connecting the Data

After we have gathered the data and gone through our suppression process, it's time to connect the data. A large part of our work on suppressions happens when we compile industry data, which covers 1,100 detailed industries for every county and ZIP code in the U.S. After we have completed that portion of the process, we run the data through a staffing pattern, which tells us how occupations are distributed across those industries. Many industries, from hospitals to doctors' offices to your local school, might employ a nurse, for example. We use these staffing patterns to discover how these jobs are distributed. In the end, we connect these 1,100 industries to over 800 occupations. The final step is connecting the occupation information to the training programs tied to each occupation, which we are able to do using education completion data from the National Center for Education Statistics.
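The staffing-pattern step can be sketched as a simple weighted sum. This is an illustration only, assuming a staffing pattern is stored as the share of each industry's jobs held by each occupation. The function name occupations_from_industries, the industry and occupation labels, and every number below are made up for the example.

```python
def occupations_from_industries(industry_jobs, staffing_pattern):
    """Convert industry employment into occupation employment.

    industry_jobs:    dict of industry -> number of jobs in a region.
    staffing_pattern: dict of industry -> {occupation: share of that
                      industry's jobs}; shares are assumed for this sketch.
    """
    totals = {}
    for industry, jobs in industry_jobs.items():
        # Industries without a staffing pattern contribute nothing.
        for occupation, share in staffing_pattern.get(industry, {}).items():
            totals[occupation] = totals.get(occupation, 0.0) + jobs * share
    return totals


# Hypothetical county: nurses work in several different industries.
county_industries = {"Hospitals": 1000, "Physician offices": 200, "Schools": 500}
pattern = {
    "Hospitals": {"Registered nurses": 0.30, "Physicians": 0.10},
    "Physician offices": {"Registered nurses": 0.15, "Physicians": 0.25},
    "Schools": {"Registered nurses": 0.01, "Teachers": 0.60},
}
county_occupations = occupations_from_industries(county_industries, pattern)
```

Here the nurse count for the county is built up from three industries, which mirrors the point that a single occupation is spread across many industries. The same lookup idea would extend to the last link in the chain, from occupations to training programs, given a mapping between the two.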

Summary

Put simply, EMSI data comes from a bunch of government sources. We take all of these different sources and use the strengths of one set to overcome the weaknesses of another until we have one comprehensive, consistent, and complete dataset.

For more on EMSI data, contact Josh Wright or see our data page. Follow EMSI on Twitter (@DesktopEcon), and check us out on LinkedIn and Facebook. Be sure to sign up for our newsletter, too.
