Delivery Format
Now that we've explained the dataset structure, we can go over the format in which this data is delivered.
File Format
Due to the high number of variables, this is a large dataset. Applications like Excel can open only subsets of the variables or geographies, and those working in more advanced statistical or programming environments will often need to manage memory, especially on a local machine.
That said, this data is typically delivered in .csv format, with each row representing a single individual or geography and each column representing a FollowGraph variable. Our standard geographic file is at the census block group level. If you are interested in other formats (H3 or census tract, for example), don't hesitate to ask your Spatial.ai representative.
The table below is an example with 10 randomly selected variables at the census block group level (scroll right to see all columns).
BLOCKGROUP | Boy Band Fans | Budget Fashion Enthusiasts | Business Readers | Dalai Lama | Daniel Dale | Olive Garden | Productivity Apps Enthusiasts | Rachael Ray Show | Soccer Enthusiasts | Cardi B |
---|---|---|---|---|---|---|---|---|---|---|
010010201001 | 119 | 141 | 60 | 74 | 60 | 146 | 58 | 126 | 85 | 92 |
010010201002 | 106 | 104 | 68 | 81 | 78 | 116 | 78 | 109 | 107 | 81 |
010010202001 | 123 | 165 | 76 | 83 | 76 | 151 | 71 | 125 | 92 | 149 |
010010202002 | 111 | 139 | 73 | 80 | 70 | 140 | 72 | 122 | 88 | 123 |
010010203001 | 105 | 125 | 72 | 78 | 75 | 131 | 75 | 120 | 93 | 104 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
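Because the full file is large, it can help to stream it and keep only the columns you need. The following is a minimal sketch, not an official loader: it assumes a comma-delimited file with a header row, uses a few rows copied from the table above in place of a real file, and the helper name `read_subset` is hypothetical. It keeps BLOCKGROUP as text so that leading zeros are preserved.

```python
import csv
import io

# Stand-in for a delivered file: three rows and three variables from the table above.
# Assumption: the real file is comma-delimited with a header row.
SAMPLE_CSV = """BLOCKGROUP,Boy Band Fans,Budget Fashion Enthusiasts,Olive Garden
010010201001,119,141,146
010010201002,106,104,116
010010202001,123,165,151
"""


def read_subset(handle, id_column, wanted):
    """Yield one dict per row with only the ID column and the wanted variables."""
    for record in csv.DictReader(handle):
        # Keep the ID as a string; converting it to a number would drop leading zeros.
        row = {id_column: record[id_column]}
        for name in wanted:
            row[name] = float(record[name])
        yield row


# With a real delivery, pass open("your_file.csv", newline="") instead of io.StringIO.
rows = list(read_subset(io.StringIO(SAMPLE_CSV), "BLOCKGROUP", ["Boy Band Fans", "Olive Garden"]))
print(rows[0])
```

Since `read_subset` is a generator, rows can also be processed one at a time without holding the whole file in memory.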
All scores provided are index scores, where 100 represents the national average. The first value in the first row is 119, indicating that people in block group 010010201001 are 1.19 times as likely as the national average to follow boy bands on social media. This can also be stated as "19% more likely". This is further explained in the Dataset Structure section.
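The arithmetic behind this reading is simple enough to sketch in code. The function names below are hypothetical and only illustrate the conversion from an index score to a ratio and a percent difference relative to the national average of 100.

```python
def index_to_ratio(score):
    """Convert an index score (100 = national average) to a likelihood ratio."""
    return score / 100.0


def describe_index(score):
    """Phrase an index score as a percent difference from the national average."""
    diff = score - 100
    if diff > 0:
        return f"{diff:g}% more likely than the national average"
    if diff < 0:
        return f"{-diff:g}% less likely than the national average"
    return "as likely as the national average"


print(index_to_ratio(119))
print(describe_index(119))
print(describe_index(60))
```

By the same logic, a score of 60 (as in the Business Readers column of the first row) reads as 0.6 times as likely, or 40% less likely.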
Your data file may include column IDs and names (for the column headers). The "Key" file contains the full variable listing, and you can use the IDs to match the data and key files. For each variable, the key file lists its name, description, categorization, and the social media accounts on which it is based.
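As a rough illustration of matching the data and key files by ID, the sketch below renames ID-based column headers to variable names. Everything specific here is an assumption: the key file column names `ID` and `Name`, the IDs `V001` and `V002`, and the helper names are all made up for the example, so check your own key file for the actual headers.

```python
import csv
import io

# Hypothetical key file: the column names and IDs are invented for illustration.
KEY_CSV = """ID,Name
V001,Boy Band Fans
V002,Olive Garden
"""

# Hypothetical data file whose column headers are variable IDs.
DATA_CSV = """BLOCKGROUP,V001,V002
010010201001,119,146
"""


def load_id_to_name(handle, id_field="ID", name_field="Name"):
    """Build a lookup from variable ID to variable name from the key file."""
    return {row[id_field]: row[name_field] for row in csv.DictReader(handle)}


def rename_headers(header, id_to_name):
    """Replace IDs with names; columns without a key entry (e.g. BLOCKGROUP) stay as-is."""
    return [id_to_name.get(col, col) for col in header]


id_to_name = load_id_to_name(io.StringIO(KEY_CSV))
reader = csv.reader(io.StringIO(DATA_CSV))
header = rename_headers(next(reader), id_to_name)
print(header)
```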
Individual Level Data Delivery
For individual level data, the data looks different in the following ways:
- The BLOCKGROUP column is replaced by a Customer ID or similar ID field.
- An additional column, Match Level, is included, which details the level at which we have matched an individual. Possible values include (but are not limited to) Name, Address, Block Group, and Zip Code.
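One practical use of the Match Level column is to check how many individuals were matched at each level before analysis. The sketch below assumes the headers are literally `Customer ID` and `Match Level` (your file may use a different ID field or spelling), and the records themselves are made up for the example.

```python
import csv
import io
from collections import Counter

# Made-up individual level records; header names are assumptions based on the list above.
INDIVIDUAL_CSV = """Customer ID,Match Level,Boy Band Fans
C-1,Name,119
C-2,Block Group,106
C-3,Zip Code,123
C-4,Name,111
"""


def count_match_levels(handle, level_field="Match Level"):
    """Count how many records were matched at each level."""
    return Counter(row[level_field] for row in csv.DictReader(handle))


counts = count_match_levels(io.StringIO(INDIVIDUAL_CSV))
for level, n in counts.most_common():
    print(level, n)
```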
There is more we'd like to emphasize here, so we've devoted a separate documentation section to individual level data delivery.