Data & Statistics Research Guide: Getting Started

This guide collects all of the data- and statistics-related research guides.

What are data?

Data are unprocessed, mostly numeric information, primarily gathered from surveys, interviews, experiments, censuses, or administrative records. Users can work interactively with the numbers at the variable level, draw conclusions from the results of their analysis, and support their research arguments with quantitative reasoning.

In most datasets, the variables are represented by columns and the records or cases are represented by rows. Data are meant to be machine-readable, so humans need proper documentation, such as a codebook or data dictionary, to understand them. Codebooks and data dictionaries outline the codes and values used to label observations. For example, a dataset may contain a Gender variable in which '1' represents Male and '2' represents Female. This information would be listed in the codebook so that we can read and interpret the values of the Gender variable.

An example of data in SPSS.

Source: NYU Data Services - Introduction to SPSS
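The role of a codebook can be illustrated with a minimal Python sketch. The codebook, the CaseID variable, and the rows below are hypothetical and exist only for illustration; only the Gender coding ('1' for Male, '2' for Female) comes from the example above.

```python
# Hypothetical codebook: maps each variable's numeric codes to labels.
CODEBOOK = {
    "Gender": {1: "Male", 2: "Female"},
}

# Hypothetical raw data: one row per case, one key per variable (column).
RAW_ROWS = [
    {"CaseID": 101, "Gender": 1},
    {"CaseID": 102, "Gender": 2},
    {"CaseID": 103, "Gender": 2},
]


def decode_row(row, codebook):
    """Return a copy of row with coded values replaced by their labels."""
    decoded = {}
    for variable, value in row.items():
        labels = codebook.get(variable, {})
        # Values with no codebook entry are left unchanged.
        decoded[variable] = labels.get(value, value)
    return decoded


decoded_rows = [decode_row(row, CODEBOOK) for row in RAW_ROWS]
for row in decoded_rows:
    print(row)
```

Without the codebook, the values 1 and 2 in the Gender column carry no meaning on their own; the lookup step is what makes them readable.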

Data come in many different file formats. Some of these include: 

*.json - JSON (JavaScript Object Notation) file, a text format for storing and exchanging data

*.dat or *.txt - ASCII data file

*.csv or *.tsv - Comma-separated values file or tab-separated values file (tab-delimited plain text file)

*.sd2 or *.stc or *.xpt - SAS data file

*.sav - SPSS/PASW data file

*.por - SPSS/PASW portable data file

*.dta - Stata data file

*.sas - SAS program file (also called SAS setup file or SAS syntax file)

*.sps - SPSS program file (also called SPSS setup file or SPSS syntax file)

*.do - Stata program file (also called Stata setup file or Stata syntax file)
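As a rough illustration of how the plain-text formats in this list differ, the sketch below reads the same small table from CSV, TSV, and JSON text using only the Python standard library. The sample contents and the read_delimited helper are made up for this example; the proprietary SAS, SPSS, and Stata formats require their own software or third-party packages and are not covered here.

```python
import csv
import io
import json

# Hypothetical sample contents, held in memory instead of real files.
CSV_TEXT = "CaseID,Gender\n101,1\n102,2\n"
TSV_TEXT = "CaseID\tGender\n101\t1\n102\t2\n"
JSON_TEXT = '[{"CaseID": 101, "Gender": 1}, {"CaseID": 102, "Gender": 2}]'


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


csv_rows = read_delimited(CSV_TEXT, ",")   # comma-separated values
tsv_rows = read_delimited(TSV_TEXT, "\t")  # tab-separated values
json_rows = json.loads(JSON_TEXT)          # JSON array of objects

print(csv_rows)
print(tsv_rows)
print(json_rows)
```

Note that the csv module returns every value as a string, while JSON preserves the distinction between numbers and text, so the CSV and TSV rows would still need type conversion before analysis.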


What are statistics?

Statistics are processed information obtained through mathematical calculations on the raw data. They are often quick facts and figures presented in tables and charts, which gives users little freedom to customize or recalculate them as they wish.

In other words, statistics summarize the data. Examples include the charts and tables commonly seen in articles and popular media.
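The step from data to statistics can be sketched in a few lines of Python. The ages and gender labels below are invented for illustration and do not come from any real dataset; the point is only that each statistic is a summary calculated from the raw values.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical raw data: one value per case.
AGES = [23, 35, 35, 41, 52]
GENDERS = ["Male", "Female", "Female", "Male", "Female"]

# Statistics: summaries calculated from the raw values.
mean_age = mean(AGES)
median_age = median(AGES)
gender_counts = Counter(GENDERS)  # frequency table

print("Mean age:", mean_age)
print("Median age:", median_age)
for label, count in gender_counts.most_common():
    print(label, count)
```

Someone who has only the printed summaries cannot recover the individual ages, which is why working with the underlying data allows more flexibility than working with published statistics.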

Example of statistics.

Source: Netflix in Statista


