The NC GOP & Pointy-headed Professors

The other day on This American Life, Dallas Woodhouse, executive director of the North Carolina GOP, dismissed evidence that vote fraud in the state was basically non-existent:

Don’t show me studies. Academics, I mean, a bunch of knuckleheads, pointy-headed professors. We deal in the real world.

Since I’ve done prior academic work on insults, I was very intrigued at the possibility of my being simultaneously pointy-headed and knuckle-headed, living in an unreality where I and my cranially-challenged colleagues churn out reams of useless studies in order to retain Total World Domination.

As it turns out, the origin of knucklehead was a U.S. Army PR/recruitment program’s Goofus-type character (like the old Highlights magazine “Goofus and Gallant”) named R.F. Knucklehead. He was never portrayed as smart, and was always making bad decisions. Here is a cartoon showing Aviation Cadet Knucklehead working hard at signing a simple signature:

Incidentally, knucklehead was also the word chosen by President Obama to describe the prostitute-hiring Secret Service agent shenanigans in Cartagena.

The origin of pointy-headed was George Wallace in 1968 (good company you’re keeping there, Dallas). The word is a play on the shape of an egg, as in egghead. The Washington Post explains the Wallace usage:

He sneered from the campaign podium at the “long-haired men and short-skirted women” of the 1960s and derided “pointy-head college professors who can’t even park a bicycle straight.”

I wonder what happened in the bicycle parking lot between Wallace and some unlucky academic. We’ll never know. But The New York Times brings back the “pointy-head” quote for our times, comparing Wallace’s use of anti-intellectual populist insults to Trump’s.

The Chronicle of Higher Ed ran an article back in 2012 listing additional stereotypes that politicians use to describe the hated professor. The article includes a really nice egghead pun from Adlai Stevenson (“Eggheads of the world, unite! You have nothing to lose but your yolks.”), who was often criticized for being one himself.

Interestingly, Google users seem to think differently about professors (for those wondering if these results were influenced by my login, they weren’t: this was an incognito browser window).

Anyway, I hope this little etymological excursion shows what a professor does when she hears something idiotic: we brush aside the insult and instead we ask lots of questions, look up the answers, synthesize the results into a conclusion, maybe ask additional questions, cite our sources, then teach what we learned to others.


Get rid of the Quizzstar 2016 “words of the year” app

Many friends are posting results of the Quizzstar “words of the year” app on Facebook. It generates a 2010-style word cloud of the words you used most frequently in your Facebook posts. To make the image, the user gives Quizzstar permission to view all their old posts and download them to Quizzstar, at which point Quizzstar generates the image. Below is a screenshot of the Quizzstar web site, showing that this app is currently their #1 most popular. (They also have other apps that harvest your friends list and so on.)

Quizzstar “most used words” app, 2016

What you might not be aware of is that by installing this app in your Facebook account, you are agreeing to have your profile and posts mined in order to change and influence the advertisements that you are subsequently shown.

It’s a little hard to follow, but the relevant parts of Quizzstar’s Privacy Policy are sections 8-18 where they describe all the different ways they mash up your data with other data in order to do the advertisement dance. Third party mashups include: Google services (Youtube, Maps, Google Ad Words, Clicky, Admob, AdSense), Facebook (Social and Remarketing), StatCounter, Criteo, and Taboola.

An example of how your FB wall posts are mixed with this third-party data is as follows (section 18):

We use the remarketing and ad technology provided by Taboola… in order to improve the relevance of the advertising presented to consumers. [This]… includes technical browser and system information, details of how you used our service, such as your navigation paths the referring site, application, or service as well as might be combined with such data collected on other sources. Taboola might also use “Web Beacons” (small invisible images) to collect information. Through the use of “Web Beacons” simple actions such as the visitor traffic to the website can be pseudonymously recorded and collected.

Doesn’t that sound fun?

If you regret installing this app, here’s how to get rid of it.

On a regular device, such as a laptop or desktop machine (i.e. full screen browser):

1. Go into privacy, and click “See more settings”

2. On the left, click “Apps”

3. Click “Show All” and hover your mouse over the errant app. Use the “X” to remove it (the Cartwheel app is shown, because I had forgotten to remove this one after an experiment last month! whoops)

Removing it on a mobile device

If you’re using a mobile device, you can remove apps by finding your profile page and clicking through as shown. Sorry Android users, this is an iPhone – I hope FB mobile is similar on your device!

Removing Data from Quizzstar

Go to the user history page on their site and scroll to the bottom.

See if it shows any history for you. (Mine didn’t because I never had the app, but maybe this works for you?)


The “corrupt Word docx” scam

This is an oldie but a goodie: suppose the homework paper is due on Friday at midnight, and you go to grade the papers on Sunday. You open a .docx paper from a student, and Word shows you the following messages:


What happened? Well, most of us will contact the student and say, “Hey, your file was corrupted!” and then the student says, “Oh, I’m so sorry” and returns a “correct” version. Was it a simple mistake, or did the student just scam you out of 2 extra days to work on the paper?

There are instructions online for how to purposely corrupt a file for this purpose, plenty of videos on how to do it, and there is even a whole Web site that will do the work for you: (I am reluctant to link to these bottom feeders, so you’ll have to type these in yourself.)

With a little work, it is possible to recover any existing text from a corrupt .docx file. Here are the steps to do so. I am on a Mac, but variations of this might work on Windows as well.

1. Download the .docx file to a safe place on your computer. I put mine in a subdirectory on my Desktop called ‘test’.

2. Find out what type of file you are really dealing with. Open Terminal (in Applications | Utilities) and run the ‘file’ command.

flossmole2:Desktop megan$ cd test

flossmole2:test megan$ ls -l
-rw-r--r--@ 1 megan staff 499449 Dec 10 15:41 test.docx

flossmole2:test megan$ file test.docx
test.docx: Microsoft OOXML

3. It looks like we are dealing with an OOXML file. This is a ZIP-compressed bundle of XML files and folders that together make up the “single” Word document. We need to uncompress it and start poking around at the folder and file structure stored within it. This document from Microsoft explains the different folders and files within an OOXML file.

4. The first thing we need to do is uncompress the file. To do this, rename the file to a .zip extension (“mv oldname newname”), and then use the commandline unzipper (“unzip newname”) to uncompress & extract it:

flossmole2:test megan$ mv test.docx

flossmole2:test megan$ unzip 
error []: missing 126 bytes in zipfile
 (attempting to process anyway)
error []: attempt to seek before beginning of zipfile
 (please check that you have transferred or created the zipfile in the
 appropriate BINARY mode and that you have compiled UnZip properly)
 (attempting to re-compensate)
 inflating: _rels/.rels 
 inflating: docProps/core.xml 
 inflating: docProps/app.xml 
 inflating: word/document.xml bad CRC 7f17798c (should be fea69872)
file #5: bad zipfile offset (local header sig): 7940
 (attempting to re-compensate)
 inflating: word/styles.xml 
 inflating: word/fontTable.xml 
 inflating: word/theme/theme1.xml 
 inflating: word/theme/_rels/theme1.xml.rels 
 inflating: word/header1.xml 
 inflating: word/footer1.xml 
 inflating: word/media/image1.png 
 inflating: word/settings.xml 
 inflating: word/_rels/document.xml.rels 
 inflating: [Content_Types].xml

Right away you’ll see that the unzipper is having trouble with the file because it has been corrupted. That’s ok, we’ll still be able to poke around and find some stuff inside it.

5. Here is the contents list of the directory now after we’ve extracted everything:

flossmole2:test megan$ ls -la
total 984
drwxr-xr-x 7 megan staff 238 Dec 10 16:36 .
drwx------+ 14 megan staff 476 Dec 10 16:29 ..
-rw-r--r--@ 1 megan staff 1999 Dec 7 18:17 [Content_Types].xml
drwxr-xr-x@ 3 megan staff 102 Dec 10 16:36 _rels
drwxr-xr-x@ 4 megan staff 136 Dec 10 16:36 docProps
-rw-r--r--@ 1 megan staff 499449 Dec 10 15:41
drwxr-xr-x@ 11 megan staff 374 Dec 10 16:36 word

6. Let’s explore down into the ‘word’ folder, since the Microsoft page said that’s where all the interesting stuff would be. Here’s what the folder structure looks like in the Finder.

7. We will need to open the main document file (word/document.xml, which holds the body text) in a text editor, or programmer’s editor. I like TextWrangler (download TextWrangler free). Right-click and tell it to open in TextWrangler:

8. You will see an error about incorrectly formatted XML and UTF8. Click past that.

9. Use the Find dialogue box to create a grep (regular expression) string like this. It will remove all the XML tags.

10. Voila! Now you can see the text. You can remove the remaining stray characters with a Find | Replace as you need to. Remember that as you go down further and further in the file, you will see more and more stray marks. This is because of the way the file was corrupted (intentionally or not).
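
For readers who would rather script steps 4 through 10 than click through them, here is a minimal Python sketch of the same idea. It is not the exact procedure above. Instead of relying on unzip, it scans the damaged file for ZIP local file headers and inflates each member as far as the data allows, then strips the XML tags with a regular expression. The function names (inflate_leniently, salvage_zip_members, strip_xml_tags, recover_text) and the tag-stripping pattern are my own assumptions for illustration, not the grep string from step 9. A header signature can also turn up by chance inside compressed data, so treat any oddly named members as noise.

```python
import html
import re
import struct
import zlib

LOCAL_SIG = b"PK\x03\x04"                      # signature that starts every ZIP member
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # fixed 30-byte local file header
TAG = re.compile(r"<[^>]+>")                   # assumed pattern, not the one from step 9


def inflate_leniently(raw: bytes, chunk_size: int = 256) -> bytes:
    """Inflate a raw DEFLATE stream, keeping whatever decodes before the damage."""
    inflater = zlib.decompressobj(-15)         # -15 selects raw DEFLATE (no zlib header)
    pieces = []
    for start in range(0, len(raw), chunk_size):
        try:
            pieces.append(inflater.decompress(raw[start:start + chunk_size]))
        except zlib.error:
            break                              # corrupted region: keep the earlier output
        if inflater.eof:
            break                              # clean end of this member's stream
    return b"".join(pieces)


def salvage_zip_members(data: bytes) -> dict:
    """Recover members by scanning local headers, ignoring the central directory."""
    recovered = {}
    pos = data.find(LOCAL_SIG)
    while pos != -1 and pos + LOCAL_HEADER.size <= len(data):
        fields = LOCAL_HEADER.unpack_from(data, pos)
        method, stored_size = fields[3], fields[7]
        name_len, extra_len = fields[9], fields[10]
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        if method == 8:                        # DEFLATE, the usual case in .docx files
            recovered[name] = inflate_leniently(data[body_start:])
        elif method == 0:                      # stored without compression
            recovered[name] = data[body_start:body_start + stored_size]
        pos = data.find(LOCAL_SIG, pos + len(LOCAL_SIG))
    return recovered


def strip_xml_tags(xml_text: str) -> str:
    """Drop XML tags, turning paragraph ends into newlines and decoding entities."""
    text = xml_text.replace("</w:p>", "\n")    # </w:p> closes a Word paragraph
    return html.unescape(TAG.sub("", text))


def recover_text(path: str) -> str:
    """Hypothetical convenience wrapper: damaged .docx path in, surviving text out."""
    with open(path, "rb") as handle:
        members = salvage_zip_members(handle.read())
    xml = members.get("word/document.xml", b"").decode("utf-8", "replace")
    return strip_xml_tags(xml)
```

Calling recover_text("test.docx") on the downloaded file should return whatever body text survives, without renaming the file first. As with the manual approach, expect stray characters toward the end of the output if the corruption falls inside word/document.xml itself.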

Explainer: Vocabulary used in /r/the_donald

The New York Times recently ran a piece called “Reddit and the God Emperor of the Internet” about a pro-Trump online community called The_Donald on Reddit. The purpose of the article was to explain to non-Redditors what The_Donald is, who populates this community, and some of the specialized vocabulary used by its 300,000+ members.

It’s a pretty good article, but it’s missing some important things. I’m going to expand their analysis, but first let me give you some quick backstory about how I got involved in this stuff. Back in March, during the presidential primaries, I realized I could not name any real live Trump supporters from my various friend circles. I could not think of a single friend in real life or in social media that had mentioned liking Trump or supporting him. And yet he kept winning, so I was really confused. Who is voting for this guy? Am I being pranked? Trump voters must exist, but why haven’t I met any? If they’re not talking to me, who are they talking to?

So, I decided that my social media must be an echo chamber, and decided to find some Trump supporters elsewhere. At this point, I was already a Reddit user, but I had mainly posted in computer science-related subreddits, and occasionally wandered into the SandersForPresident subreddit. But everyone on Reddit seemed to know about The_Donald, the subreddit that was ground zero for mocking Bernie supporters, starting flamewars with the Hillary “shills”, and trying to get their “spicy” Trump meme images promoted to Reddit’s front page. Since I study online software development communities in my academic research, it seemed natural for me to collect data about The_Donald, just like I would do in my normal research.

The first thing I noticed about The_Donald is the highly specialized vocabulary used by the in-group. The learning curve is not terribly steep, but there is definitely a set of jargon that is used to signify belonging. The NYT article touches on some of the terms, but here are my additions:

  1. Centipede. A Trump supporter. Abbreviation: ‘pede. Origin: this video (“Pede” should not be confused with pedo, see #pizzagate) Example headline: Fox and Friends: “Trump leading ALL OVER” Fellow ‘Pedes ITS HAPPENING!!!
  2. Based. Adjective used to give very high praise, especially when applied to someone who is acting in support of Trump. Example: PENNSYLVANIA FLIPPING RED – BASED AMISH TO THE RESCUE
  3. CTR. Short for “Correct the Record”. CTR is a Hillary superpac the purpose of which is to create “shill” accounts on social media. These shill accounts downvote Trump postings and upvote Hillary postings. CTR is very hated by Based Centipedes. Example: You know we are winning because CTR shills are being TRIGGERED left and right. Don’t let them divide us! THEY HAVE NO POWER HERE!
  4. Plant lady, based plant lady. A nickname for Jill Stein, whom The_Donald users inexplicably both respected and reviled at the same time. “Based Grandma” is another synonym. Example: BASED PLANT LADY Jill Stein ALL BUT ENDORSES TRUMP! Trump is for PEACE!
  5. Crooked. Short for Crooked Hillary, Trump’s nickname for her.
  6. Coats. When you publicly affirm your support for Trump, you are given a coat. Example: Give this man a coat! This comes from an early Trump rally where he asked security to confiscate a protestor’s coat.
  7. Pol. Short for /pol/, which is a board on 4chan devoted to being politically incorrect. Lots of overlap between /pol/ and The_Donald. Please don’t go to /pol/. I warned you.
  8. Cucks. Short for cuckold. Refers to a person who is not a Trump supporter, especially one who “should” be, for example a man who doesn’t support Trump, or a media figure who is giving Trump a hard time. Alternate forms: cuckservative (a conservative who doesn’t follow Trump). Origin: comes from #GamerGate
  9. SJW. Short for social justice warrior. Origin: comes from #GamerGate
  10. Nimble navigator. This is a synonym for based centipede. Origin: this Youtube video describing centipedes as nimble navigators, overlaid with Trump video captures
  11. Autists. This refers to “person who has so many great computer skills, i.e. for hunting info, they must be autistic.” Alternates: weaponized autists. Example: Weaponized autists on /pol/ compile 400+ page document exposing Crooked Shillary! SPREAD THIS LINK LIKE WILDFIRE, AND THEN KEEP SPREADING IT!
  12. The best. “We have the best ____ don’t we folks?” Fill in with any noun. Comes from “I have the best words” and other Trumpist use of “the best”. Example: We have the best autists, don’t we folks?
  13. Spez. /u/spez is the username of the CEO of Reddit. Roundly hated and criticized by The_Donald community. To use this in a sentence, they may claim he is a – vocabulary test incoming – SJW cuck whose company is set up to shill for Hillary using money from CTR. This revulsion and hatred for Spez got turned up to 11 over the Thanksgiving weekend when Spez revealed that he had abused his power as CEO of Reddit to edit the comments of some The_Donald members.
  14. High energy – very high praise. Opposite of “low energy” (as in Low Energy Jeb, Trump’s nickname for Jeb Bush during the primaries).
  15. White wolf and Silver Fox. These are nicknames for Mike Pence. Example: Let’s take a moment to give our thanks to Mike Pence, the White Wolf. It wasn’t too hard to annihilate the Creepy Kaine in the debates, but it was crucial nonetheless
  16. MAGA. Perhaps obvious, but stands for Make America Great Again. This is a greeting and a goodbye, kind of like Aloha.
  17. Reeeeee and Pepe. Pepe is a cartoon frog that is used by Trump supporters as a leading figure in their memes. It is also used in some white nationalist communities, but did not originate as a racist figure. The sound Pepe makes in anger is “Reeeee”.
  18. Tendies. This is a tough one. Tendies is short for chicken tenders. The origin of tendies is very complex, but supporters use it to mean an immature person, such as a whiny Bernie supporter who wants free stuff and lives in his mother’s basement, eating tendies and getting “good boy points” for being nice. An example would be this headline: Trump doesn’t eat mommies tendies. He eats real fried chicken from a bucket of American KFC he bought himself!
  19. MSM. The mainstream media.
  20. Red pilled. Happens to people when they find out too much about the lies told to them by “normies” (normal people) and the MSM. Origin: The Matrix – Neo takes the Red Pill and finds out that he has been serving as a human battery.

Another thing that was very interesting to me was a type of initiation ritual that was developed for former Bernie supporters to pledge their allegiance to Trump. The ritual goes like this: first, the “afterberner” publicly declares his support for Trump by posting on The_Donald. Next, as part of that post, the afterberner disavows Bernie Sanders. Finally, the afterberner is welcomed onto “The Trump Train” and given a coat (see #6 in the vocabulary list above).

Aside from initiating new members, The_Donald posters primarily spend their time generating memes, criticizing opponents, and sharing and commenting on links. Near the end of the general election process, some of the more weaponized autists (see #11 above) donated substantial time to working on Wikileaks, specifically in finding anti-Hillary evidence within the leaked Podesta emails. I was also working on the Wikileaks emails, so I noticed them a lot. Their presence on the DNCLeaks and WikiLeaks subreddits was definitely noticed and not always appreciated.

In a prior posting I compared some of the language and beliefs of participants on The_Donald to other online communities, such as free and open source software communities, some white supremacist online communities, and the alt-right media.

Similarities between some male-dominated online communities

The Guardian had a great article today that makes explicit many of the connections between the so-called “alt-right” and other predominantly male online movements/communities such as #Gamergate. I’d extend their analysis by adding two more communities: free, libre, and open source software (FLOSS) developers, and pro-Trump communities like the_donald on Reddit. Like Gamergate and alt-right, these are male online communities that have the same predictable speaking style and culture as referenced in the Guardian article:

Prominent supporters on Twitter, in subreddits and on forums like 8Chan, developed a range of pernicious rhetorical devices and defences to distance themselves from threats to women and minorities in the industry: the targets were lying or exaggerating, they were too precious; a language of dismissal and belittlement was formed against them. Safe spaces, snowflakes, unicorns, cry bullies…. These techniques, forged in Gamergate, have become the standard toolset of far-right voices online.

I’ve built data sets of insults, gender stereotypes, double entendres (e.g. “that’s what she said” jokes) and so on using the 90+% male FLOSS developer community, and earlier this year I worked with a student and another colleague to build a machine learning classifier that could automatically detect the abusive speech style of Linus Torvalds as compared to other Linux maintainers.

After doing this work, I think there are a few other characteristics in common between all these communities:

  • Their members engage in hero worship of strong, identifiable male leaders. Moreover, these hero-leaders are always excused when they behave badly.
    • Trump in the_donald,
    • Torvalds in FLOSS,
    • Robert E. Lee/Stonewall Jackson and a large cast of other generals and ancestors to worship in neo-confederate communities,
    • even Cernovich in the allegedly “leaderless” Gamergate
  • Their members believe that meritocracy is the ideal arbiter, and that they have been victimized in the past by non-meritocratic institutions or systems. Examples of this are replete in discussions of:
    • affirmative action,
    • the obsession with illegal immigration,
    • jobs lost to foreign nations,
    • software projects that are in jeopardy because standards weren’t high enough,
    • “strivers” as a euphemism for white supremacy, etc.
  • Their members fetishize their hero as a David up against some Goliath.

So yes, Guardian, it is all very predictable.

Wikileaks releases 60k emails from security company HBGary

Today Wikileaks released a searchable interface for 60k HBGary emails. HBGary is infamous for claiming that it had developed social media-based techniques that allowed it to track down members of Anonymous back in 2011, and for collaborating with the federal government to discredit certain liberal groups, including unions. The earliest emails are from 2008, the latest from 2011.

An example from the dump:

Wikileaks HBGary email 2574