Why we shouldn’t know our own passwords

Here’s a little article I wrote for The Conversation about why current border security policies make it important that we develop technology so that we cannot reveal our own passwords. This is an online publication where the articles are written by academics, then released under a Creative Commons license so they can be re-published elsewhere. So far the most views have come from Time and Alternet. Pretty interesting.

Update: The local Greensboro paper ran the article too, and on April 5, I’m going to be on a radio show talking about this stuff. It’s so weird that one little cybersecurity article can be this interesting to people, meanwhile I’ve toiled away on free software for years and no one cares 🙂. I seriously need to change fields.

Greensboro News&Record

Found and liberated 2,151 missing DHS files

On January 18, 2017 the US Department of Homeland Security discontinued its Daily Open Source Infrastructure Report service which it had run since October 2006. To enable researchers to study the content of these reports, I collected as many as I could find (2,151 PDF files) and released them to the Internet Archive. You can find them here: DHS Daily Open Source Infrastructure Reports 2006-2017

The PDF files came from the following URLs:

  • https://www.dhs.gov/sites/default/files/publications/
  • https://www.dhs.gov/sites/default/files/publications/nppd/ip/daily-report/
  • https://www.dhs.gov/xlibrary/assets/

And when these yielded 404 errors (which they did for most pre-2013 files) I used the Internet Archive itself, with the following URL base:


Files are named as they were upon download, in one of the following patterns:

  • DHS_Daily_Report_2006-10-11.pdf (most 2006-2012 files have this format)
  • DHS-Daily-Report-2012-12-06.pdf (a single December 2012 file has this format)
  • dhs-daily-report-2013-01-09 (most 2013-2017 files have this format)

If you are interested in missing dates (for example Archive.org was missing some dates and a few files were corrupted), this blog might be able to help fill in the gaps.
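For anyone checking which dates are covered, here is a minimal sketch of how the report date could be pulled out of the three file name patterns listed above. The function name and the regular expression are illustrative assumptions, not part of the original collection process.

```python
import re
from datetime import date

# Matches the three naming patterns listed above, ignoring case and
# treating "_" and "-" as interchangeable separators (an assumption).
REPORT_NAME = re.compile(
    r"dhs[_-]daily[_-]report[_-](\d{4})-(\d{2})-(\d{2})", re.IGNORECASE
)

def report_date(filename):
    """Return the date embedded in a report file name, or None if absent."""
    match = REPORT_NAME.search(filename)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)

names = [
    "DHS_Daily_Report_2006-10-11.pdf",
    "DHS-Daily-Report-2012-12-06.pdf",
    "dhs-daily-report-2013-01-09",
]
dates = [report_date(name) for name in names]
for name, found in zip(names, dates):
    print(name, "->", found)
```

Sorting the resulting dates and looking for gaps is one way to spot reports that may be missing from the collection.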


Changes to Professor Watchlist – who was removed?

Nearly 12,000 professors have used the AAUP’s “Add My Name” feature in order to be added to Turning Point USA’s Professor Watchlist, and large groups of faculty from Trinity University and University of Notre Dame, among others, have also requested to join. The Professor Watchlist was created in order to expose professors who “advance a radical agenda in lecture halls” and inclusion on the list is supposed to be based on “incidents that have already been reported by a credible source.”

How has the list changed?

The Professor Watchlist debuted in November with 146 names, and has grown to 166 names as of January 3, 2017. I was curious who has been added (obviously not all 12,000 who requested to be added!), and even more curious about who has been taken off the list.

So I created a Google Spreadsheet showing the names in November and the names as they show up in January. I got the November list from archive.org’s Wayback Machine, and the current list from the Professor Watchlist website.

Google Spreadsheet showing additions & removals

Data cleaning steps

  • I re-alphabetized 4 names that were out of order on the November list
  • I lined up the names so we could more easily see who was added/removed
  • I colorized each name with red if it was removed since November, and green if it was added since November.
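The same added/removed comparison can also be done in a few lines of code. This is a minimal sketch using placeholder names; the spreadsheet itself was lined up and colored by hand, as described in the steps above.

```python
def compare_lists(november, january):
    """Return (added, removed) names, each sorted alphabetically."""
    before, after = set(november), set(january)
    added = sorted(after - before)      # in January but not in November
    removed = sorted(before - after)    # in November but not in January
    return added, removed

# Placeholder names, not real entries from either list.
november = ["Adams, A.", "Baker, B.", "Clark, C."]
january = ["Adams, A.", "Clark, C.", "Davis, D.", "Evans, E."]

added, removed = compare_lists(november, january)
print("added:", added)
print("removed:", removed)
```

A set difference ignores ordering, so the out-of-order names would not need to be re-alphabetized first. Spelling variants of the same name would still show up as one removal plus one addition.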


Unfortunately, the Wayback Machine does not have the original PW pages indexed for each individual professor, so I can’t go back and see what they wrote for each, but based on what I was able to find online, the rationales for these seven seem very flimsy.

Anyway, at a rate of only 35 changes in a month, Turning Point is going to need to hire some more interns to enter all these 12,000 names! Good luck with that.

Linguistic analysis of a local neo-confederate group’s Facebook posts

Earlier I showed how to extract the postings from a given Facebook page. Here, I will show you how to do some basic text mining on the posts you found. For practice, I will use the messages of a local neo-Confederate group called ACTBAC (“Alamance County Taking Back Alamance County”). Their antics have been covered in local media, but with their re-branding in light of the Trump election and the rise of the alt-right, many people in our area are still wondering just what this group is all about. Perhaps text mining can help illuminate some of their beliefs and strategies for us.


I ran the script on their “ALAMANCEOURS” Facebook page, and it yielded 1017 messages beginning in June, 2015. Here is the spreadsheet (actbac.csv) in case anyone wants to play around with it.

Top 50 words most used in their FB posts

I wrote a program to count frequencies and remove stopwords (stopwords are boring words like ‘a’, ‘to’, ‘it’, ‘is’). Then I highlighted the most interesting words (to me) in yellow. Each word is shown with its count next to it.

From these, we can see many predictable words for a county-based neo-Confederate group (county, state, southern, cause, carolina). However, I was most intrigued by the prominence of the word ‘stand’.
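For readers who want to try the counting step themselves, here is a minimal sketch of a word-frequency counter with stopword removal. It is not the program used for the table above: the stopword list is a tiny illustrative one, and the sample messages are made up rather than taken from actbac.csv.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "it", "is", "of", "in",
             "we", "for", "our"}

def top_words(messages, n=50):
    """Count lowercase word frequencies across messages, skipping stopwords."""
    counts = Counter()
    for message in messages:
        words = re.findall(r"[a-z']+", message.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Made-up sample messages, not taken from actbac.csv.
sample = [
    "We stand for our county and our state.",
    "Stand with the county today.",
]
top = top_words(sample, n=3)
for word, count in top:
    print(word, count)
```

Lowercasing before counting means ‘Stand’ and ‘stand’ are tallied together, which matters for a word that often begins a sentence.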

Usage of the word ‘stand’

Stand can be both a noun (“take a stand”) and a verb (“stand up for yourself”). With this group, ‘stand’ is the most common verb used in their messages (not counting stopwords like ‘be’ or ‘is’). My hypothesis is that, as a verb, this word ‘stand’ conveys a lot of the power of their movement. Why?

To help understand how they use ‘stand’, I wrote a program to generate a concordance to show how the word is used in their messages. The first few lines of the concordance look like this:

The word of interest (shown in red) is placed in the center of each line. The concordance then shows each collection of words around that word.

From this, I learned that the word ‘stand’ is used 291 times in 1017 messages, most commonly as follows:

In addition, there are another 41 uses of “stood” and 86 uses of “standing”.

It would be interesting to compare this usage to other Confederate and non-Confederate groups to see whether this is a uniquely ACTBAC thing (I doubt it), or – more likely – it is a rhetorical device used more broadly by all Confederate groups. I would guess that their defensive “stand up for your beliefs, no matter how unpopular” plea has great power in a neo-Confederate setting. After all, the “Lost Cause” narrative also describes a heroic, virtuous South fighting against all odds, and ultimately unfairly defeated in the American Civil War.
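A concordance of the kind described in this section can be sketched in a few lines. This is an illustrative version under my own assumptions (whole-word matching, a fixed context width of 30 characters, made-up sample messages), not the program that produced the counts above.

```python
import re

def concordance(messages, keyword, width=30):
    """Return one line per occurrence of keyword, centered with context."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for message in messages:
        text = " ".join(message.split())  # collapse newlines and extra spaces
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Pad both sides so the keyword lines up in the same column.
            lines.append(f"{left:>{width}}{match.group()}{right:<{width}}")
    return lines

# Made-up sample messages.
sample = [
    "We will stand together for our county.",
    "Take a stand and stand up for your beliefs.",
]
lines = concordance(sample, "stand")
for line in lines:
    print(line)
```

Because the pattern matches whole words only, forms like “stood” and “standing” would need their own passes, which fits with counting them separately.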

Topic modeling

Next, just for fun, I wrote a program to build a topic model of the postings. A topic model tells us what words frequently co-occur in sentences, and tries to make groupings of those words into possible “topics”. Inside the program, you can fiddle with the number of topics, and the number of words generated for each.

After running a few experiments, I settled on 3 topics with 4 words each. These topics weren’t terribly interesting, as you can see below, but we can still learn a few interesting things. First, when ‘stand’ is mentioned, it is often used with ‘southern’ and ‘state’, and it seems to be ‘people’ who are doing the standing (makes sense). Additionally, the topic we could call ‘Confederate battle flag’ emerges (labeled Topic 3 below):

Text Difficulty

Finally, I looked at how difficult the text was to read. These are fairly simple analyses based on sentence structure, number of “difficult” words, and how many syllables are in the words.

The FKRE is the Flesch-Kincaid Reading Ease metric, which tells you how “easy” a document is to read, and then this number (71.55, or “fairly easy”) can be converted to a grade level metric (7th grade). I also ran an overall readability summary, which integrates several other difficulty measures in addition to FKRE. That one also puts this text at right around 6th or 7th grade.
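To make the reading-ease number less of a black box, here is a minimal sketch of the standard Flesch Reading Ease formula: 206.835 minus 1.015 times the average words per sentence, minus 84.6 times the average syllables per word. The syllable counter below is a rough heuristic of my own, so its scores will differ somewhat from those of a dedicated readability library.

```python
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

easy = flesch_reading_ease("The cat sat on the mat. The dog ran to the park.")
hard = flesch_reading_ease(
    "Institutional accountability necessitates comprehensive "
    "organizational transparency."
)
print(round(easy, 2), round(hard, 2))
```

Short sentences made of one-syllable words score very high, while a single sentence of long words can even go negative. A score in the low 70s sits in the “fairly easy” band.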

I hope you enjoyed this quick tour of text mining – perhaps you will find some interesting techniques to use on your own projects!

Analysis of the latest 1000 Facebook posts by the Times-News (Aug-Dec)

I was playing around with some code today from Mastering Social Media Mining with Python (by Marco Bonzanini, and published by the same company that published my last two books), and I came up with this snazzy set of scripts (postGetter.py, fileParser.py) that mines the last X posts from any public Facebook page, creates a clickable FB url for each, sorts them in order of most interactions (shares + likes), and creates a spreadsheet with the results.

Here are the results when run for the last 1000 posts by the Times-News of Burlington, our local newspaper: timesNews.csv.


Not that surprising or shocking, but here goes. The last 1000 only goes back to August or so (modify the params at the top of the code to make it scrape more), but the top five posts for August-December based on interactions seem to be:

  1. The death of Tim-Bob from Graham Cinema
  2. The abduction of a middle schooler from a bus stop
  3. Kmart closing
  4. 25-minute Christmas Lights show on Maple Ridge Dr.
  5. Housing emergency at Burlington Animal Services

No election-related or weather-related items cracked the top 20.
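The ranking step is simple enough to sketch. This is not the code from postGetter.py or fileParser.py: the field names, the post ids, and the URL pattern below are all assumptions made for illustration, and the posts are made up.

```python
import csv
import io

def rank_posts(posts):
    """Sort posts by interactions (shares + likes), highest first."""
    return sorted(posts, key=lambda p: p["shares"] + p["likes"], reverse=True)

def write_csv(posts, out):
    """Write ranked posts to a CSV file-like object."""
    writer = csv.writer(out)
    writer.writerow(["url", "shares", "likes", "interactions"])
    for post in posts:
        # Assumed URL pattern built from a hypothetical post id.
        url = "https://www.facebook.com/" + post["id"]
        total = post["shares"] + post["likes"]
        writer.writerow([url, post["shares"], post["likes"], total])

# Made-up posts with hypothetical ids and counts.
posts = [
    {"id": "100_1", "shares": 5, "likes": 40},
    {"id": "100_2", "shares": 30, "likes": 200},
    {"id": "100_3", "shares": 0, "likes": 12},
]
ranked = rank_posts(posts)
buffer = io.StringIO()
write_csv(ranked, buffer)
print(buffer.getvalue())
```

Writing to a file instead of a string buffer only requires passing an open file handle (opened with newline="") as the second argument.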

The NC GOP & Pointy-headed Professors

The other day on This American Life, Dallas Woodhouse, executive director of the North Carolina GOP, dismissed evidence that vote fraud in the state was basically non-existent:

Don’t show me studies. Academics, I mean, a bunch of knuckleheads, pointy-headed professors. We deal in the real world.

Since I’ve done prior academic work on insults, I was very intrigued at the possibility of my being simultaneously pointy-headed and knuckle-headed, living in an unreality where I and my cranially-challenged colleagues churn out reams of useless studies in order to retain Total World Domination.

As it turns out, the origin of knucklehead was a U.S. Army PR/recruitment program’s Goofus-type character (like the old Highlights magazine “Goofus and Gallant”) named R.F. Knucklehead. He was never portrayed as smart, and was always making bad decisions. Here is a cartoon showing Aviation Cadet Knucklehead working hard at signing a simple signature:

Incidentally, knucklehead was also the word chosen by President Obama to describe the prostitute-hiring Secret Service agent shenanigans in Cartagena.

The origin of pointy-headed was George Wallace in 1968 (good company you’re keeping there, Dallas). The word is a play on the shape of an egg, as in egghead. The Washington Post explains the Wallace usage:

He sneered from the campaign podium at the “long-haired men and short-skirted women” of the 1960s and derided “pointy-head college professors who can’t even park a bicycle straight.”

I wonder what happened in the bicycle parking lot between Wallace and some unlucky academic. We’ll never know. But The New York Times brings back the “pointy-head” quote for our times, comparing Wallace’s use of anti-intellectual populist insults to Trump’s.

Chronicle of Higher Ed ran an article back in 2012 listing additional stereotypes that politicians use to describe the hated professor. The article includes a really nice egghead pun from Adlai Stevenson (“Eggheads of the world, unite! You have nothing to lose but your yolks.”), who was often criticized for being one himself.

Interestingly, Google users seem to think differently about professors (for those wondering if these results were influenced by my login, they weren’t: this was an incognito browser window).

Anyway, I hope this little etymological excursion shows what a professor does when she hears something idiotic: we brush aside the insult and instead we ask lots of questions, look up the answers, synthesize the results into a conclusion, maybe ask additional questions, cite our sources, then teach what we learned to others.


Get rid of the Quizzstar 2016 “words of the year” app

Many friends are posting results of the Quizzstar “words of the year” app on Facebook. It generates a 2010-style word cloud of the words you used on Facebook posts most frequently. To make the image, the user gives Quizzstar permission to view all their old posts, download them to Quizzstar, at which point Quizzstar generates the image. Below is a screenshot of the Quizzstar web site, showing that this app is currently their #1 most popular. (They also have other apps that harvest your friends list and so on.)

Quizzstar “most used words” app, 2016

What users might not be aware of is that by installing this app in your Facebook account, you are agreeing to have your profile and posts mined in order to change and influence the advertisements that you are subsequently shown.

It’s a little hard to follow, but the relevant parts of Quizzstar’s Privacy Policy are sections 8-18 where they describe all the different ways they mash up your data with other data in order to do the advertisement dance. Third party mashups include: Google services (Youtube, Maps, Google Ad Words, Clicky, Admob, AdSense), Facebook (Social and Remarketing), StatCounter, Criteo, and Taboola.

An example of how your FB wall posts are mixed with this third party data is as follows (section 18):

We use the remarketing and ad technology provided by Taboola… in order to improve the relevance of the advertising presented to consumers. [This]… includes technical browser and system information, details of how you used our service, such as your navigation paths the referring site, application, or service as well as might be combined with such data collected on other sources. Taboola might also use “Web Beacons” (small invisible images) to collect information. Through the use of “Web Beacons” simple actions such as the visitor traffic to the website can be pseudonymously recorded and collected.

Doesn’t that sound fun?

If you regret installing this app, here’s how to get rid of it.

On a regular device, such as a laptop or desktop machine (i.e. full screen browser):

  1. Go into privacy, and click “See more settings”

2. On the left, click “Apps”

3. Click “Show All” and hover your mouse over the errant app. Use the “X” to remove it (the Cartwheel app is shown, because I had forgotten to remove this one after an experiment last month! whoops)

Removing it on a mobile device

If you’re using a mobile device, you can remove apps by finding your profile page and clicking through as shown. Sorry Android users, this is an iPhone – I hope FB mobile is similar on your device!

Removing Data from Quizzstar

Go to their user history page on their site, scroll to the bottom.

See if it shows any history for you. (Mine didn’t because I never had the app, but maybe this works for you?)


The “corrupt Word docx” scam

This is an oldie but a goodie: suppose the homework paper is due on Friday at midnight, and you go to grade the papers on Sunday. You open a .docx paper from a student, and Word shows you the following messages:


What happened? Well, most of us will contact the student and say, “Hey, your file was corrupted!” and then the student says, “Oh, I’m so sorry” and returns a “correct” version. Was it a simple mistake, or did the student just scam you out of 2 extra days to work on the paper?

There are instructions online for how to purposely corrupt a file for this purpose, plenty of videos on how to do it, and there is even a whole Web site that will do the work for you: Corrupt-A-File.net (I am reluctant to link to these bottom feeders, so you’ll have to type these in yourself.)

With a little work, it is possible to recover any existing text from a corrupt .docx file. Here are the steps to do so. I am on a Mac, but variations of this might work on Windows as well.

1. Download the .docx file to a safe place on your computer. I put mine in a subdirectory on my Desktop called ‘test’.

2. Find out what type of file you are really dealing with. Open Terminal (in Applications | Utilities) and run the ‘file’ command.

flossmole2:Desktop megan$ cd test

flossmole2:test megan$ ls -l
-rw-r--r--@ 1 megan staff 499449 Dec 10 15:41 test.docx

flossmole2:test megan$ file test.docx
test.docx: Microsoft OOXML

3. It looks like we are dealing with an OOXML file. This is a compressed XML version of a bunch of files and folders that actually make up the “single” Word document. We need to uncompress it and start poking around at the folder and file structure stored within it. This document from Microsoft explains the different folders and files within an OOXML file.

4. The first thing we need to do is uncompress the file. To do this, rename the file to a .zip extension (“mv oldname newname”), and then use the commandline unzipper (“unzip newname”) to uncompress & extract it:

flossmole2:test megan$ mv test.docx test.docx.zip

flossmole2:test megan$ unzip test.docx.zip 
Archive: test.docx.zip
error [test.docx.zip]: missing 126 bytes in zipfile
 (attempting to process anyway)
error [test.docx.zip]: attempt to seek before beginning of zipfile
 (please check that you have transferred or created the zipfile in the
 appropriate BINARY mode and that you have compiled UnZip properly)
 (attempting to re-compensate)
 inflating: _rels/.rels 
 inflating: docProps/core.xml 
 inflating: docProps/app.xml 
 inflating: word/document.xml bad CRC 7f17798c (should be fea69872)
file #5: bad zipfile offset (local header sig): 7940
 (attempting to re-compensate)
 inflating: word/styles.xml 
 inflating: word/fontTable.xml 
 inflating: word/theme/theme1.xml 
 inflating: word/theme/_rels/theme1.xml.rels 
 inflating: word/header1.xml 
 inflating: word/footer1.xml 
 inflating: word/media/image1.png 
 inflating: word/settings.xml 
 inflating: word/_rels/document.xml.rels 
 inflating: [Content_Types].xml

Right away you’ll see that the unzipper is having trouble with the file because it has been corrupted. That’s ok, we’ll still be able to poke around and find some stuff inside it.

5. Here is the contents list of the directory now after we’ve extracted everything:

flossmole2:test megan$ ls -la
total 984
drwxr-xr-x 7 megan staff 238 Dec 10 16:36 .
drwx------+ 14 megan staff 476 Dec 10 16:29 ..
-rw-r--r--@ 1 megan staff 1999 Dec 7 18:17 [Content_Types].xml
drwxr-xr-x@ 3 megan staff 102 Dec 10 16:36 _rels
drwxr-xr-x@ 4 megan staff 136 Dec 10 16:36 docProps
-rw-r--r--@ 1 megan staff 499449 Dec 10 15:41 test.docx.zip
drwxr-xr-x@ 11 megan staff 374 Dec 10 16:36 word

6. Let’s explore down into the ‘word’ folder, since the Microsoft page said that’s where all the interesting stuff would be. Here’s what the folder structure looks like in the Finder.

7. We will need to open the file in a text editor, or programmer’s editor. I like TextWrangler (download TextWrangler free). Right-click and tell it to open in TextWrangler:

8. You will see an error about incorrectly formatted XML and UTF8. Click past that.

9. Use the Find dialogue box to create a grep (regular expression) string like this. It will remove all the XML tags.

10. Voila! Now you can see the text. You can remove the remaining stray characters with a Find | Replace as you need to. Remember that as you go down further and further in the file, you will see more and more stray marks. This is because of the way the file was corrupted (intentionally or not).
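If you would rather script the tag-stripping step than do it in a text editor, here is a minimal sketch. The pattern `<[^>]+>` is my assumption of a typical "remove all XML tags" expression, not necessarily the exact one used in step 9, and the sample bytes stand in for the contents of word/document.xml.

```python
import re

def strip_xml_tags(raw_bytes):
    """Decode possibly damaged XML and drop anything that looks like a tag."""
    # errors="replace" swaps undecodable bytes for a marker instead of failing.
    text = raw_bytes.decode("utf-8", errors="replace")
    text = re.sub(r"<[^>]+>", " ", text)
    return " ".join(text.split())  # collapse runs of whitespace

# Stand-in for the extracted file; a real run would instead use
# open("word/document.xml", "rb").read().
sample = (b"<w:p><w:r><w:t>My homework</w:t></w:r>"
          b"<w:r><w:t>essay</w:t></w:r></w:p>\xff")
recovered = strip_xml_tags(sample)
print(recovered)
```

Replacing each tag with a space keeps neighboring words from running together, at the cost of occasionally splitting a word that Word stored across two runs. The stray characters from the corrupted region still need cleaning up by hand, as in step 10.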

Explainer: Vocabulary used in /r/the_donald

The New York Times recently ran a piece called “Reddit and the God Emperor of the Internet” about a pro-Trump online community called The_Donald on Reddit. The purpose of the article was to explain to non-Redditors what The_Donald is, who populates this community, and some of the specialized vocabulary used by its 300,000+ members.

It’s a pretty good article, but it’s missing some important things. I’m going to expand their analysis, but first let me give you some quick backstory about how I got involved in this stuff. Back in March, during the presidential primaries, I realized I could not name any real live Trump supporters from my various friend circles. I could not think of a single friend in real life or in social media that had mentioned liking Trump or supporting him. And yet he kept winning, so I was really confused. Who is voting for this guy? Am I being pranked? Trump voters must exist, but why haven’t I met any? If they’re not talking to me, who are they talking to?

So, I decided that my social media must be an echo chamber, and decided to find some Trump supporters elsewhere. At this point, I was already a Reddit user, but I had mainly posted in computer science-related subreddits, and occasionally wandered into the SandersForPresident subreddit. But everyone on Reddit seemed to know about The_Donald, the subreddit that was ground zero for mocking Bernie supporters, starting flamewars with the Hillary “shills”, and trying to get their “spicy” Trump meme images promoted to Reddit’s front page. Since I study online software development communities in my academic research, it seemed natural for me to collect data about The_Donald, just like I would do in my normal research.

The first thing I noticed about The_Donald is the highly specialized vocabulary used by the in-group. The learning curve is not terribly steep, but there is definitely a set of jargon that is used to signify belonging. The NYT article touches on some of the terms, but here are my additions:

  1. Centipede. A Trump supporter. Abbreviation: ‘pede. Origin: this video (“Pede” should not be confused with pedo, see #pizzagate) Example headline: Fox and Friends: “Trump leading ALL OVER” Fellow ‘Pedes ITS HAPPENING!!!
  2. Based. Adjective used to give very high praise, especially when applied to someone who is acting in support of Trump. Example: PENNSYLVANIA FLIPPING RED – BASED AMISH TO THE RESCUE
  3. CTR. Short for “Correct the Record”. CTR is a Hillary superpac the purpose of which is to create “shill” accounts on social media. These shill accounts downvote Trump postings and upvote Hillary postings. CTR is very hated by Based Centipedes. Example: You know we are winning because CTR shills are being TRIGGERED left and right. Don’t let them divide us! THEY HAVE NO POWER HERE!
  4. Plant lady, based plant lady. A nickname for Jill Stein, who The_Donald users inexplicably, simultaneously both respected and reviled. “Based Grandma” is another synonym. Example: BASED PLANT LADY Jill Stein ALL BUT ENDORSES TRUMP! Trump is for PEACE!
  5. Crooked. Short for Crooked Hillary, Trump’s nickname for her.
  6. Coats. When you publicly affirm your support for Trump, you are given a coat. Example: Give this man a coat! This comes from an early Trump rally where he asked security to confiscate a protestor’s coat.
  7. pol is short for /pol/ which is a board on 4chan devoted to being politically incorrect. Lots of overlap between /pol/ and The_Donald. Please don’t go to /pol/. I warned you.
  8. Cucks. Short for cuckold. Refers to a person who is not a Trump supporter, especially one who “should” be, for example a man who doesn’t support Trump, or a media figure who is giving Trump a hard time. Alternate forms: cuckservative (a conservative who doesn’t follow Trump). Origin: comes from #GamerGate
  9. SJW. Short for social justice warrior. Origin: comes from #GamerGate
  10. Nimble navigator. This is a synonym for based centipede. Origin: this Youtube video describing centipedes as nimble navigators, overlaid with Trump video captures
  11. Autists. This refers to “person who has so many great computer skills, i.e. for hunting info, they must be autistic.” Alternates: weaponized autists. Example: Weaponized autists on /pol/ compile 400+ page document exposing Crooked Shillary! SPREAD THIS LINK LIKE WILDFIRE, AND THEN KEEP SPREADING IT!
  12. The best. “We have the best ____ don’t we folks?” Fill in with any noun. Comes from “I have the best words” and other Trumpist use of “the best”. Example: We have the best autists, don’t we folks?
  13. Spez. /u/spez is the username of the CEO of Reddit. Roundly hated and criticized by The_Donald community. To use this in a sentence, they may claim he is a – vocabulary test incoming – SJW cuck whose company is set up to shill for Hillary using money from CTR. This revulsion and hatred for Spez got turned up to 11 over the Thanksgiving weekend when Spez revealed that he had abused his power as CEO of Reddit to edit the comments of some The_Donald members.
  14. High energy – very high praise. Opposite of “low energy” (as in Low Energy Jeb, Trump’s nickname for Jeb Bush during the primaries).
  15. White wolf and Silver Fox. These are nicknames for Mike Pence. Example: Let’s take a moment to give our thanks to Mike Pence, the White Wolf. It wasn’t too hard to annihilate the Creepy Kaine in the debates, but it was crucial nonetheless
  16. MAGA. Perhaps obvious, but stands for Make America Great Again. This is a greeting and a goodbye, kind of like Aloha.
  17. Reeeeee and Pepe. Pepe is a cartoon frog that is used by Trump supporters as a leading figure in their memes. It is also used in some white nationalist communities, but did not originate as a racist figure. The sound Pepe makes in anger is “Reeeee”.
  18. Tendies. This is a tough one. Tendies is short for chicken tenders. The origin of tendies is very complex, but supporters use it to mean an immature person, such as a whiny Bernie supporter who wants free stuff and lives in his mother’s basement, eating tendies and getting “good boy points” for being nice. An example would be this headline: Trump doesn’t eat mommies tendies. He eats real fried chicken from a bucket of American KFC he bought himself!
  19. MSM. The mainstream media.
  20. Red pilled. Happens to people when they find out too much about the lies told to them by “normies” (normal people) and the MSM. Origin: The Matrix – Neo takes the Red Pill and finds out that he has been serving as a human battery.

Another thing that was very interesting to me was a type of initiation ritual that was developed for former Bernie supporters to pledge their allegiance to Trump. The ritual goes like this: first, the “afterberner” publicly declares his support for Trump by posting on The_Donald. Next, as part of that post, the afterberner disavows Bernie Sanders. Finally, the afterberner is welcomed onto “The Trump Train” and given a coat (see #6 in the vocabulary list above).

Aside from initiating new members, The_Donald posters primarily spend their time generating memes, criticizing opponents, and sharing and commenting on links. Near the end of the general election process, some of the more weaponized autists (see #11 above) donated substantial time to working on Wikileaks, specifically in finding anti-Hillary evidence within the leaked Podesta emails. I was also working on the Wikileaks emails, so I noticed them a lot. Their presence on the DNCLeaks and WikiLeaks subreddits was definitely noticed and not always appreciated.

In a prior posting I compared some of the language and beliefs of participants on The_Donald to other online communities, such as free and open source software communities, some white supremacist online communities, and the alt-right media.

Similarities between some male-dominated online communities

The Guardian had a great article today that makes explicit many of the connections between the so-called “alt-right” and other predominantly male online movements/communities such as #Gamergate. I’d extend their analysis by adding two more communities: free, libre, and open source software (FLOSS) developers, and pro-Trump communities like the_donald on Reddit. Like Gamergate and alt-right, these are male online communities that have the same predictable speaking style and culture as referenced in the Guardian article:

Prominent supporters on Twitter, in subreddits and on forums like 8Chan, developed a range of pernicious rhetorical devices and defences to distance themselves from threats to women and minorities in the industry: the targets were lying or exaggerating, they were too precious; a language of dismissal and belittlement was formed against them. Safe spaces, snowflakes, unicorns, cry bullies…. These techniques, forged in Gamergate, have become the standard toolset of far-right voices online.

I’ve built data sets of insults, gender stereotypes, double entendres (e.g. “that’s what she said” jokes) and so on using the 90+% male FLOSS developer community, and earlier this year I worked with a student and another colleague to build a machine learning classifier that could automatically detect the abusive speech style of Linus Torvalds as compared to other Linux maintainers.

After doing this work, I think there are a few other characteristics in common between all these communities:

  • Their members engage in hero worship of strong, identifiable male leaders. Moreover, these hero-leaders are always excused when they behave badly.
    • Trump in the_donald,
    • Torvalds in FLOSS,
    • Robert E. Lee/Stonewall Jackson and a large cast of other generals and ancestors to worship in neo-confederate communities,
    • even Cernovich in the allegedly “leaderless” Gamergate
  • Their members believe that meritocracy is the ideal arbiter, and that they have been victimized in the past by non-meritocratic institutions or systems. Examples of this are replete in discussions of:
    • affirmative action,
    • the obsession with illegal immigration,
    • jobs lost to foreign nations,
    • software projects that are in jeopardy because standards weren’t high enough,
    • “strivers” as a euphemism for white supremacy, etc.
  • Their members fetishize their hero as a David up against some Goliath
    • Trump is King David up against the mainstream media and liberal elites,
    • GamerGate versus PC culture (social justice warriors, or “SJWs”) and PC journalism,
    • Neo-confederates versus Yankees/“Northern Aggression”/Lost Cause Mythology/carpetbaggers,
    • FLOSS projects against the evil proprietary software company du jour (Linus vs. Microsoft, Linus vs. Oracle, Linus vs. Sun, Linus vs. Nvidia), and the list goes on.

So yes, Guardian, it is all very predictable.