Methodology
Overview
FollowGraph is derived from analysis of following data from public social media accounts. Every variable (with the exception of the "Interests" variables) is modeled individually, with the goal of predicting the likelihood that any individual (or aggregate of individuals in an area) follows a given account or group of accounts.
The predicted following percentage (also called the likelihood) is then indexed to the national average (see the Index Scores section for more information about index scores).
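As a rough illustration of indexing to the national average, the sketch below assumes the common convention that an index of 100 equals the national average. The exact formula is documented in the Index Scores section and may differ; the function name and the numbers here are hypothetical.

```python
def index_score(predicted_pct: float, national_avg_pct: float) -> float:
    """Index a predicted following percentage to the national average.

    Assumes the common convention that 100 means "equal to the national
    average". This is an illustrative assumption, not the documented formula.
    """
    if national_avg_pct <= 0:
        raise ValueError("national_avg_pct must be positive")
    return 100.0 * predicted_pct / national_avg_pct


# Hypothetical numbers: an area where 6% are predicted to follow an account,
# against a national average of 4%.
area_index = index_score(predicted_pct=6.0, national_avg_pct=4.0)
print(area_index)  # 150.0
```

Under this assumption, a value of 150 would mean the area is 1.5 times as likely as the nation overall to follow the account, and a value below 100 would mean it is less likely.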
Sources
The following datasets were used in the creation of FollowGraph variables:
- Social Media Following - millions of lists of the accounts people follow
- Mobile Visitation - cell phone movement/visitation aggregated at the census block group level
- Demographics
    - Census demographics
    - Individual and household level variables
Coverage
FollowGraph is available across the entire USA (all 50 states). The dataset covers over 99% of US census block groups. It is also backed by a database of approximately 281 million records used to match individuals across the USA.
Regionality and Local-ness
Some variables are significantly affected by geography and regionality. Examples of these types of variables include sports teams (Cincinnati Bengals, Dallas Mavericks, etc.), local politicians, and more.
For every variable in our dataset, we calculate a score of how "local" or regional that variable is. When a variable's "regionality" or "locality" score is above a certain threshold, special variables are included in the modeling stage to account for its relationship to geography and location.
Interest Variables
Variables categorized as "Interests" are a special case (see the categorization section for more info on categorization). Interest variables are generated slightly differently.
Interest variables are calculated using data from several social media accounts. They are based on themes that are valuable for analysts and marketers to understand. Here are a few examples:
- UFC & MMA Fans - modeled using the social media accounts of UFC, Dana White, Ronda Rousey, and Conor McGregor
- Adult Cartoon Watchers - Adult Swim, Family Guy, Rick & Morty, and South Park
- Politically Progressive - AOC, Elizabeth Warren, and Sunrise Movement
The key file that comes with FollowGraph specifies the accounts that make up each Interest variable.
Some Interest variables are made up of only one account. These variables have exactly the same values as the corresponding individual variables. Many of them relate to specific brands (especially car brands) and are generated for use on external advertising platforms.
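To make the relationship between Interest variables and their accounts concrete, the sketch below represents the key file as a simple in-memory mapping. The key file's actual format is not described here, so the structure, the function name, and the single-account entry are assumptions for illustration; the three multi-account entries are the examples listed above.

```python
# Hypothetical in-memory form of the key file: Interest variable -> accounts.
interest_key = {
    "UFC & MMA Fans": ["UFC", "Dana White", "Ronda Rousey", "Conor McGregor"],
    "Adult Cartoon Watchers": ["Adult Swim", "Family Guy", "Rick & Morty", "South Park"],
    "Politically Progressive": ["AOC", "Elizabeth Warren", "Sunrise Movement"],
    # Hypothetical placeholder for a single-account (brand) Interest variable.
    "Example Car Brand Fans": ["Example Car Brand"],
}


def single_account_interests(key: dict[str, list[str]]) -> dict[str, str]:
    """Return Interest variables backed by exactly one account.

    These are the variables whose values match the individual variable
    for that same account.
    """
    return {name: accounts[0] for name, accounts in key.items() if len(accounts) == 1}


print(single_account_interests(interest_key))
```

A check like this can help avoid double counting when an Interest variable and its matching individual variable are both included in the same analysis.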