
Delivery Format

Now that we've explained the dataset structure, we can go over the format in which this data is delivered.

File Format

Because of the high number of variables, this is a large dataset. Applications like Excel will only be able to open subsets of variables or geographies, and those working with this data in more advanced statistical or programming environments may need to manage memory.

That said, this data is typically delivered in .csv or .parquet format with each row representing a single geography and each column representing a FollowGraph variable. Our standard geographic file is at the census block group level.
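One way to keep memory use down is to read only the columns you need. The following is a minimal sketch using Python's standard library, assuming a comma-separated file whose header row uses variable names and a BLOCKGROUP column as in the example table; the in-memory sample stands in for a real delivered file.

```python
import csv
import io

# Stand-in for a delivered .csv file; in practice, open the real file instead.
sample_csv = io.StringIO(
    "BLOCKGROUP,Boy Band Fans,Olive Garden,Cardi B\n"
    "010010201001,119,146,92\n"
    "010010201002,106,116,81\n"
)

# Columns to keep; everything else is discarded as each row is read.
wanted = ["BLOCKGROUP", "Boy Band Fans", "Cardi B"]

rows = []
for record in csv.DictReader(sample_csv):
    # Keep BLOCKGROUP as text so its leading zeros are preserved.
    row = {"BLOCKGROUP": record["BLOCKGROUP"]}
    # Index scores are whole numbers in the sample, so convert them to int.
    for name in wanted[1:]:
        row[name] = int(record[name])
    rows.append(row)

print(rows[0])
```

If you work in pandas, `pandas.read_csv` accepts `usecols` and `dtype` arguments and `pandas.read_parquet` accepts a `columns` argument for the same purpose. Reading BLOCKGROUP as a string matters in either case, because a numeric type would drop the leading zero.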

The table below shows an example with 10 randomly selected variables at the census block group level.

| BLOCKGROUP | Boy Band Fans | Budget Fashion Enthusiasts | Business Readers | Dalai Lama | Daniel Dale | Olive Garden | Productivity Apps Enthusiasts | Rachael Ray Show | Soccer Enthusiasts | Cardi B |
| 010010201001 | 119 | 141 | 60 | 74 | 60 | 146 | 58 | 126 | 85 | 92 |
| 010010201002 | 106 | 104 | 68 | 81 | 78 | 116 | 78 | 109 | 107 | 81 |
| 010010202001 | 123 | 165 | 76 | 83 | 76 | 151 | 71 | 125 | 92 | 149 |
| 010010202002 | 111 | 139 | 73 | 80 | 70 | 140 | 72 | 122 | 88 | 123 |
| 010010203001 | 105 | 125 | 72 | 78 | 75 | 131 | 75 | 120 | 93 | 104 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

All scores provided are index scores where 100 represents the national average. The first value in the first row is 119, indicating that the people in block group 010010201001 are 1.19 times as likely as the national average to follow Boy Bands on social media. This can also be stated as "19% more likely". Likewise, a score below 100 indicates a lower likelihood than the national average. This is further explained in the Dataset Structure section.
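The arithmetic behind an index score can be sketched as follows. The function names are illustrative and not part of any delivered tooling.

```python
def index_to_ratio(index_score: float) -> float:
    """Likelihood relative to the national average (an index of 100 gives 1.0)."""
    return index_score / 100


def index_to_percent_difference(index_score: float) -> float:
    """Percent more (positive) or less (negative) likely than the national average."""
    return index_score - 100


# Boy Band Fans in block group 010010201001 has an index score of 119.
print(index_to_ratio(119))               # 1.19 times as likely
print(index_to_percent_difference(119))  # 19, i.e. 19% more likely

# Business Readers in the same block group has an index score of 60.
print(index_to_percent_difference(60))   # -40, i.e. 40% less likely
```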

Your data file may use column IDs instead of names as the column headers. The "Key" file contains the full variable listing, and you can use the IDs to match the data and key files. For each variable, the key file lists its name, description, categorization, and the social media accounts on which the variable is based.
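Matching IDs to names could look something like the sketch below. The key file's column names (`id`, `name`) and the ID values (`V0001`, `V0002`) are assumptions made for illustration; check your own key file for the actual headers and ID format.

```python
import csv
import io

# Hypothetical key file contents: the column names "id" and "name" are assumed.
key_csv = io.StringIO(
    "id,name\n"
    "V0001,Boy Band Fans\n"
    "V0002,Cardi B\n"
)

# Hypothetical data file that uses column IDs as headers.
data_csv = io.StringIO(
    "BLOCKGROUP,V0001,V0002\n"
    "010010201001,119,92\n"
)

# Build a lookup from variable ID to variable name.
id_to_name = {row["id"]: row["name"] for row in csv.DictReader(key_csv)}

reader = csv.reader(data_csv)
header = next(reader)

# Replace each ID with its name; headers not in the key (BLOCKGROUP) stay as-is.
renamed_header = [id_to_name.get(col, col) for col in header]

rows = [dict(zip(renamed_header, values)) for values in reader]
print(rows[0])
```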

Multiple Files

This dataset has thousands of variables. Depending on the use case, we may deliver the data in multiple files. Typically, we would split along variable sections (e.g. one file for "Interests", one file for "Celebrities & Influencers", etc.).
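When the data arrives in several files, they can be recombined by joining on the geography column. The sketch below assumes every file carries the same BLOCKGROUP column; the assignment of variables to each section file is made up for illustration.

```python
import csv
import io

# Stand-ins for two delivered section files (which variables go in which
# file is an assumption here).
interests_csv = io.StringIO(
    "BLOCKGROUP,Soccer Enthusiasts\n"
    "010010201001,85\n"
    "010010201002,107\n"
)
celebrities_csv = io.StringIO(
    "BLOCKGROUP,Cardi B\n"
    "010010201001,92\n"
    "010010201002,81\n"
)

# Merge rows from every file into one record per block group.
combined = {}
for section_file in (interests_csv, celebrities_csv):
    for row in csv.DictReader(section_file):
        combined.setdefault(row["BLOCKGROUP"], {}).update(row)

print(combined["010010201001"])
```

This keeps every block group that appears in any file. If the files might cover different geographies, check for records that are missing variables before analysis.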

