Big Data: It’s Everywhere

Big data has changed our world, and it’s also changed how the Michigan Program in Survey and Data Science looks at survey methods and survey methodology.

Data is everywhere, and it plays a profound role in our everyday lives. It’s assimilated and analyzed for insights into human behavior, and experts across countless fields use its findings to make better decisions and strategic moves.

In this question-and-answer session, Sunghee Lee, a research associate professor, and Frederick Conrad, director of the Michigan Program in Survey and Data Science, share their expertise on survey methodology, data science, and big data tools. They also discuss how survey methodology and data science have changed their approaches to research and teaching within the program.

Understanding Big Data: Q & A With Fred Conrad and Sunghee Lee from the Michigan Program in Survey and Data Science

We hear the term in our everyday lives, but what is big data?

Lee: Big data is a subjective term. As the field gets used to a certain data size, what is big data today may not be big tomorrow. Having said that, data that is automatically or passively collected with little engagement or awareness from the subjects is generally big data. A couple of examples would include Internet browsing history and posts on social media. As more people shop online and use social media, and as the ways to capture such information grow, the amount of data will become larger by the minute.

Conrad: When we discuss what big data is, we tend to talk about organic or found data, rather than big data, to reflect the idea that this data was not designed to be used in social research but can be repurposed for this kind of work. Survey data is designed in the sense that, once the questionnaire is developed, the structure of the data set is pretty much determined. But organic data, not being designed, has a structure that is unknown ahead of time.

I also think an important part of big data is that it indirectly measures what we’re interested in, requiring researchers to infer its meaning. For example, if you ask a survey respondent to rate their stress right now, their answers mean what we think they do. But, if you use heart rate (a type of big or organic data) to measure stress, we must differentiate spikes in heart rate due to exercise, caffeine, or psychological stress, and we typically cannot do that through survey research.

How has big data — and big data tools — changed survey methods and survey methodology?

Lee: It has elevated how we think about data. Survey researchers are used to data designed for specific research goals or questions and collected under researchers’ control. Big data is not necessarily born out of research needs and may be available at a much lower cost than traditional survey data. There is no reason not to investigate its utility for improving our ability to advance science.

Conrad: I agree. Big data has pushed us to think about survey data in a larger context and to appreciate the strengths and weaknesses of what we have traditionally done. Big data tools accumulate data quickly compared to even the fastest surveys. That speed may focus us on tracking change over time more than on producing cross-sectional point estimates.

Rather than reporting that a percentage of individuals believe something to be true, big data almost forces us to talk about comparisons to the previous collection of data. In the case of social media, it may provide richness that responses to survey questions can lack. Surveys, on the other hand, allow us to investigate whatever issue we want, not just a collection of search strings or social media posts.

How has big data enhanced or improved survey methods and survey methodology?

Conrad: The arrival of big data — and big data tools — has led survey methodologists to think of ourselves as data quality specialists. Surveys are all about estimating some attribute of the population as accurately as we can, but most users of big data really don’t think about the accuracy of the estimates — assuming that because the data set is big, it’s credible. But survey researchers recognize how biases in who provides the data can hurt the accuracy of the estimates from any data source, including big data.

Lee: There is a great deal of effort to use big data as a complement to surveys, or as a tool for them, largely for cost reasons. Moreover, it has pushed the survey methodology field to embrace more advanced statistical techniques, particularly those that require large statistical power and/or improve prediction.
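Conrad’s point that a large data set is not automatically an accurate one can be illustrated with a small simulation. The sketch below is a hypothetical illustration and does not come from the interview: the population size, the 30% share of people who are active online, and the opinion rates (60% among the online-active, 30% among everyone else) are all assumed numbers chosen only to make the effect visible. It compares an estimate from a large "organic" data set that covers only online-active people with an estimate from a much smaller simple random sample of the whole population.

```python
import random


def simulate(seed=0, population_size=100_000, survey_size=1_000):
    """Compare a big but self-selected data set with a small random sample.

    All rates below are illustrative assumptions, not real-world figures.
    """
    rng = random.Random(seed)

    # Build a synthetic population of (active_online, holds_opinion) pairs.
    population = []
    for _ in range(population_size):
        active_online = rng.random() < 0.30          # assumed: 30% post online
        opinion_rate = 0.60 if active_online else 0.30  # assumed opinion rates
        holds_opinion = rng.random() < opinion_rate
        population.append((active_online, holds_opinion))

    # The quantity we want to estimate: share of everyone holding the opinion.
    true_share = sum(h for _, h in population) / population_size

    # "Organic" data: every online-active person, and nobody else.
    organic = [h for active, h in population if active]
    organic_estimate = sum(organic) / len(organic)

    # Survey data: a simple random sample drawn from the whole population.
    survey = rng.sample(population, survey_size)
    survey_estimate = sum(h for _, h in survey) / survey_size

    return {
        "true_share": true_share,
        "organic_n": len(organic),
        "organic_estimate": organic_estimate,
        "survey_n": survey_size,
        "survey_estimate": survey_estimate,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value}")
```

Under these assumptions the organic data set is roughly thirty times larger than the survey sample, yet its estimate sits far from the population value because of who contributes the data, while the small random sample lands close to it. Real organic data rarely comes with a known selection mechanism like the one assumed here, which is part of why assessing its quality is hard.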

How do you ensure data quality in the era of big data?

Lee: Although this new type of data is emerging quickly and drawing the public’s attention, it remains uncertain whom this data represents and what it means. These are the data quality issues that survey methodologists have long focused on and that data scientists have now turned their attention to.

Conrad: As I just mentioned, the arrival of big data, and its unquestioned use, has led survey methodologists to extend error frameworks developed for surveys to other kinds of data. For example, if tweets cannot be longer than 280 characters, it won’t be possible to express some ideas on Twitter. That introduces a type of error that reduces the quality of data culled from Twitter.

You’re both not only researchers in the field, you’re educators, too. When you look at the Michigan Program in Survey and Data Science, what makes your program stand out when it comes to survey methods and its intersection with data science?

Lee: We look at the data from the perspective that there is no such thing as error-free data. Our mission is to improve data quality by reducing errors, in particular, errors related to representation and measurement properties of the data.

Conrad: We emphasize the powerful theoretical and analytic tools developed for survey research, and we teach students how to extend these skills to new data sources. Data science programs do teach students how to work with big data and big data tools, but not with survey data, at least not in the depth we do.

Pursue Your Passion: Earn a Degree in Survey and Data Science

Nearly 100% of U-M graduates over the past two decades are employed in survey and data science careers or are seeking advanced education. You, too, can pursue a career that drives decisions through our survey and data science master’s degree or PhD programs.

Learn more about the Michigan Program in Survey and Data Science

View Potential Career Options in Survey and Data Science

Discover Your Return on Investment on a Master’s Degree in Survey and Data Science

Learn more about what big data is and its intersection with survey methodology

Our alumni work in nearly every industry, providing information to decision makers in order to understand opinions, preferences, beliefs, and desires. Learn more about the current outlook, as well as salaries, in survey and data science.

Pursuing a career in survey and data science requires specialized skills and experience. Learn more about Michigan’s curriculum for survey and data science graduate students.

Thinking about a survey and data science career? Earning a master’s degree will open doors to more job opportunities, provide real-world experience, and boost your earning potential. Find out more. Here are 10 reasons why a master’s in survey and data science may be right for you.