Microsoft CEO Satya Nadella meeting with the president nearly one year ago to the day. Photo: Chip Somodevilla (Getty)
Tensions are high within Microsoft, as new scrutiny is given to a partnership between the company’s Azure Government cloud computing arm and U.S. Immigration and Customs Enforcement (ICE), according to several Microsoft employees who spoke to Gizmodo on the condition of anonymity. Two were considering leaving the company based on the response.
The partnership was first made public in late January, when Microsoft announced it was “proud to support” the agency’s efforts, but given the size of the company, many employees were not aware any such agreement was in place until recently. A likely catalyst for the new scrutiny is the recent revelation that ICE separates asylum-seeking families and confines children in cages.
In response, the announcement post was “briefly deleted […] after seeing commentary in social media,” according to a Microsoft spokesperson who refused to divulge the specific nature of the Azure/ICE partner arrangement. “This was a mistake and as soon as it was noticed the blog was reverted to previous language.”
Internally, as news of the contract spread, employees expressed their dissent. “This is the sort of thing that would make me question staying,” one employee told Gizmodo. Another echoed, “I’ll seriously consider leaving if I’m not happy with how they handle this.”
Microsoft condemned family separation by ICE in a statement to Gizmodo but declined to say whether particular tools within Azure Government, like Face API (facial recognition software), were in use by the agency. The company also did not comment on whether it had assisted in building artificial intelligence tools for ICE, something the agency has been seeking (and courting Microsoft over) for some time.
“My sense is that the government cloud group is very much a sales/consulting group, so it’s definitely plausible they could have been working on something specific, but if so then it would likely have been helping them customize existing public product tech,” a current Microsoft employee told Gizmodo.
The possibility of Microsoft providing cheap, efficient facial recognition software to ICE comes less than a month after the ACLU discovered Amazon had given law enforcement agencies access to its similar in-house tool, Rekognition, and several months after Gizmodo first revealed Google had agreed to assist in Project Maven, a program to help develop artificial intelligence for drone footage analysis for the Pentagon.
Microsoft told Gizmodo it was “dismayed” by ICE’s actions and that it “urge[d] the administration to change its policy and Congress to pass legislation ensuring children are no longer separated from their families.” Absent from its statement was whether it would continue to provide its cloud services to ICE.
As more employees become aware of the agreement—and the recent activities of ICE—it remains to be seen what response these internal frustrations will draw from Microsoft’s leadership.
Do you work for Microsoft and have thoughts or information on this ICE partnership? Get in touch via email to get my Signal number, chat with me on Keybase, or send documents anonymously to our Secure Drop server.
These were all made with neural networks, a type of AI modeled on the network-like nature of our own brains. You train a neural network by giving it input: recipes, for example. The network strengthens some of the connections between its neurons (imitation brain cells) more than others as it learns. The idea is that it’s figuring out the rules of how the input works: which letters tend to follow others, for example. Once the network is trained, you can ask it to generate its own output, or give it a partial input and ask it to fill in the rest.
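To make “which letters tend to follow others” concrete, here is a toy sketch in plain Python. It is not a neural network: it only counts which character follows which in a few made-up lines, the kind of pattern a character-level network has to pick up on its own during training. The sample lines and the function name are invented for illustration.

```python
from collections import Counter, defaultdict

def count_followers(lines):
    """Count, for every character, which characters come right after it."""
    followers = defaultdict(Counter)
    for line in lines:
        # Pair each character with the one that follows it.
        for current, nxt in zip(line, line[1:]):
            followers[current][nxt] += 1
    return followers

# Made-up sample data, standing in for a real list of recipes or headlines.
sample_lines = ["the best way", "the best bar", "the beer"]
followers = count_followers(sample_lines)

# The character that most often follows "t" in the sample data.
most_common_after_t = followers["t"].most_common(1)[0][0]
print(most_common_after_t)
```

A real network learns far longer-range patterns than single pairs of letters, but the counts give a feel for the raw material it works from.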
But the computer doesn’t actually understand the rules of, say, making recipes. It knows that beer can be an ingredient, and that things can be cut into cubes, but nobody has ever told it that beer is not one of those things. The outputs that look almost right, but misunderstand some fundamental rule, are often the most hilarious.
I was happy to just watch these antics from afar, until Shane mentioned on Twitter that a middle school coding class had generated better ice cream names than she had. And I thought, if kids can do this, I can do this.
How to Train Your First Neural Net
I started with the same toolkit Shane used for ice cream flavors: a Python module called textgenrnn, by Max Woolf of BuzzFeed. You’ll need a basic knowledge of the command line to work with it, but it works on any system (Mac, Linux, Windows) where you’ve installed the Python programming language and its interpreter.
Before you can train your own neural net, you’ll need some input to start with. The middle school class started with a list of thousands of ice cream flavors, for example. Whatever you choose, you’ll want at least a few hundred examples; thousands would be better. Maybe you’d like to download all your tweets, and ask the network to generate you some new tweets. Or check out Wikipedia’s list of lists of lists for ideas.
Whatever you choose, get it into a text file with one item per line. This may take some creative copy-and-paste or spreadsheet work, or if you’re an old hand at coding, you can write some ugly Perl scripts to munge the data into submission. I’m an ugly Perl script kind of girl, but when I ended up wanting Lifehacker headlines for one of my data sets, I just asked our analytics team for a big list of headlines and they emailed me exactly what I needed. Asking nicely is an underrated coding skill.
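If you would rather do the cleanup in Python than Perl, a minimal sketch might look like the following. It assumes your raw data is already roughly one item per line, just messy. Dropping duplicates is a choice made for this sketch, not something textgenrnn requires, and the function names and sample flavors are made up.

```python
from pathlib import Path

def clean_lines(raw_text):
    """Return non-empty, de-duplicated lines with surrounding whitespace removed."""
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        item = line.strip()
        if item and item not in seen:
            seen.add(item)
            cleaned.append(item)
    return cleaned

def write_input_file(raw_text, destination="input.txt"):
    """Write one cleaned item per line and return how many items were written."""
    items = clean_lines(raw_text)
    Path(destination).write_text("\n".join(items) + "\n", encoding="utf-8")
    return len(items)

# Made-up messy data, as it might look after pasting from a spreadsheet.
messy = "  Mint Chip \n\nRocky Road\nMint Chip\n   \nVanilla Bean\n"
print(clean_lines(messy))
```

Calling write_input_file(messy) would then produce an input.txt ready for training.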
Create a folder for your new project, and write two scripts. First, one called train.py:
from textgenrnn import textgenrnn
t = textgenrnn()
t.train_from_file('input.txt', num_epochs=5)
This script will get the neural net reading your input and thinking about what its rules must be. The script has a few things you can modify:
t = textgenrnn() is fine the first time you run the script, but if you’d like to come back to it later, enter the name of the .hdf5 file that magically appeared in the folder when you ran it. In that case, the line should look like this: t = textgenrnn('textgenrnn_weights.hdf5')
'input.txt' is the name of your file with one headline/recipe/tweet/etc per line.
num_epochs is how many times you’d like to process the file. The neural network gets better the longer you let it study, so start with 2 or 5 to see how long that takes, and then go up from there.
It takes a while to train the network. If you’re running your scripts on a laptop, one epoch might take 10 or 15 minutes (bigger data sets will take longer). If you have access to a beefy desktop, maybe your own or a friend’s gaming computer, things will go faster. If you’ve got a big data set, you may want to ask it for a few dozen or even hundreds of epochs, and let it run overnight.
Next, write another script called spit_out_stuff.py (you’re free to give these better names than I did):
from textgenrnn import textgenrnn
t = textgenrnn('textgenrnn_weights.hdf5')
t.generate(20, temperature=0.5)
This is the fun part! The script above will give you 20 fun new things to look at. The important parts of that last line are:
The number of things to generate: here, 20.
The temperature, which is like a creativity dial. At 0.1, you’ll get very basic output that’s probably even more boring than what you fed in. At 1.0, the output will get so creative that often what comes out isn’t even real words. You can go higher than 1.0, if you dare.
When you ran the training script, you’ll have noticed that it shows you sample output at different temperatures, so you can use that to guide how many epochs you run, and what temperature you’d like to use to generate your final output.
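To see why temperature behaves like a creativity dial, here is a small sketch of a common way text generators apply it: divide the log of each candidate’s probability by the temperature, then renormalize. This is assumed to be close to what textgenrnn does internally, not copied from its source, and the three probabilities are made up.

```python
import math

def apply_temperature(probabilities, temperature):
    """Rescale a probability distribution: low temperature sharpens it, high flattens it."""
    scaled = [math.log(p) / temperature for p in probabilities]
    biggest = max(scaled)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(s - biggest) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up probabilities for three candidate next characters.
next_char_probs = [0.6, 0.3, 0.1]

for temperature in (0.1, 0.5, 1.0, 1.5):
    rescaled = apply_temperature(next_char_probs, temperature)
    print(temperature, [round(p, 3) for p in rescaled])
```

At a low temperature nearly all of the weight lands on the likeliest character, which is why the output turns repetitive. At higher temperatures the unlikely characters get picked more often, which is where the non-words come from.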
Not every idea your neural network comes up with will be comedy gold. You’ll have to pick out the best ones yourself. Here are some of the better Lifehacker headlines that my AI came up with:
The Best Way to Make a Baby Laptop
How to Survive a Backspace Drinking Game
The Best Way to Buy a Job Interview
How to Get the Best Bonfire of Your Life With This Handy Graphic
How to Make Your Own Podcast Bar
How to Get a New iPhone X If You’re an Arduino
How to Clean Up Your Own Measurements in a Museum
How to Get Started With Your Stories and Anxiety
The Best Way to Make Your Own Ink Out of the Winter
How to Keep Your Relationship With an Imaginary Concept
The Best Way to Make a Perfect Cup of Wine With a Raspberry Pi
The Best Way to Eat a Toilet Strawberry
How to Get a Better Job on Your Vacation
The Best Way to Eat a Stubborn Jar
I got these by playing with the temperature and the number of training epochs, and every time I saw something I liked I copied it into a text file of my favorites. I also experimented with the word-by-word version of the algorithm; the scripts above use the default character-by-character model. My final list of headlines includes results from both.
If you’re curious about some of the rejects, here’s what I get with a 0.1 temperature:
The Best Way to Stay Streaming to Stop More Alternative to Make Your Phone
The Best Way to Stream the Best Power When You Don’t Need to Know About the World
The Best Way to Stay Started to Stay Started to Your Common Ways to Stop Anyone
How to Get the Best Way to See the Best Popular Posts
The Best Way to Stay Started to Make Your Phone
And if I crank it up to 1.5 (dangerously creative):
Remains of the Day: How to Ad-Finger the Unsubual
Renew Qakeuage to Travel History, Ovenchime, or “Contreiting Passfled
The Risk-Idelecady’t Two-Copyns, Focusing Zoomitas
Ifo Went Vape Texts Battery Oro crediblacy Supremee Buldsweoapotties
DIY Grilling Can Now Edt My Hises Uniti to Spread Your Words
Clearly, human help is needed.
Become Your AI’s Buddy
Even though neural nets can learn from data sets, they don’t truly understand what’s going on. That’s why some of the best results come from partnerships between people and machines. “I know it is a tool that I use,” says Janelle Shane, “but it is hard not to think of it as, ‘come on little neural network, you can do it’ and ‘Oh, that was clever’ or ‘You’re getting confused, poor little thing.’”
To make the most of your relationship, you’ll have to guide your AI buddy. Sometimes it might get so good at guessing the rules of your data set that it just recreates the same things you fed it—the AI version of plagiarism. You’ll have to check that its funny output is truly original.
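One way to run that originality check is to compare each generated item against the training data and keep only the ones that do not appear there. The sketch below is a minimal version under simple assumptions: it catches exact copies (ignoring case and extra spaces) but not near-copies, and the sample lists and function names are made up.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not hide a copy."""
    return " ".join(text.lower().split())

def only_originals(generated, training_items):
    """Keep generated items that do not appear in the training data."""
    known = {normalize(item) for item in training_items}
    return [item for item in generated if normalize(item) not in known]

# Made-up examples standing in for input.txt and the network's output.
training_items = ["How to Make Your Own Podcast", "The Best Way to Clean a Jar"]
generated = [
    "How to Make Your Own Podcast Bar",
    "the best way to clean a jar",
    "The Best Way to Eat a Stubborn Jar",
]
originals = only_originals(generated, training_items)
print(originals)
```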
Botnik Studios pairs people with machines by training predictive-text keyboards. Imagine if you picked up your friend’s phone, and typed messages by just using the predictive text on their keyboard. You’d end up writing your own message, but in a style that reads like your friend’s. In the same way, you can train a Botnik keyboard with any data source you’d like, and then write with the words supplied by the keyboard. That’s where this amazing advice column duel came from: two Botnik keyboards trained on Savage Love and Dear Abby.
If you’d prefer to work against, rather than with, your algorithmic buddy, check out how Janelle Shane pranked a neural net that at first appeared to be good at recognizing sheep grazing in a meadow. She photoshopped out the sheep, and realized the AI was just looking for white blobs in grass. If she colored the sheep orange, the AI thought they were flowers. So she asked her Twitter followers for sheep in unusual places and found that the AI thinks a sheep in a car must be a dog, goats in a tree must be birds, and a sheep in a kitchen must be a cat.
Serious AIs can have similar problems, and playing with algorithms for fun can help us understand why they’re so error-prone. For example, one early skin-cancer-detecting AI accidentally learned the wrong rules for telling the difference between cancerous and benign skin lesions. When a doctor finds a large lesion, they often photograph it next to a ruler to show the size. The AI accidentally taught itself that it’s easy to spot cancerous tumors: just look for rulers.
Another lesson we can learn is that an algorithm’s output is only as good as the data you feed in. ProPublica found that one algorithm used in sentencing was harsher on black defendants than white ones. It didn’t consider race as a factor, but its input led it to believe, incorrectly, that the crimes and backgrounds common to black defendants were stronger predictors of repeat offenses than the crimes and backgrounds associated with white defendants. This computer had no idea of the concept of race, but if your input data reflects a bias, the computer can end up perpetuating that bias. It’s best that we understand this limitation of algorithms, and not assume that because they aren’t human they must be impartial. (Good luck with your hate speech AI, Facebook!)
Mix Up Your Data Sets
There’s no need to stop at one data set; you can mix up two of them and see what results. (I combined the product listings from the Goop and Infowars stores, for example. Slightly NSFW.)
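Mixing two data sets can be as simple as combining the two lists into one training file. Here is a minimal sketch, with the assumption that shuffling is worthwhile so neither source sits in one solid block. The product names are invented placeholders, not real listings from any store.

```python
import random

def mix_data_sets(first, second, seed=None):
    """Combine two lists of items and shuffle them into one training list."""
    combined = list(first) + list(second)
    random.Random(seed).shuffle(combined)
    return combined

# Made-up product names standing in for two real lists.
store_a = ["Rose Quartz Water Bottle", "Jade Face Roller"]
store_b = ["Emergency Food Bucket", "Survival Water Filter"]

mixed = mix_data_sets(store_a, store_b, seed=42)
print("\n".join(mixed))
```

If one list is much longer than the other, the longer one will probably dominate what the network learns, so trimming it to a similar length is one option worth trying.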
You can also train a classifying algorithm. Shane says she already had a list of metal bands and a list of My Little Pony names, so she trained a classifier to tell the difference. (Pinky Doom: 99 percent metal.) Once you have a classifier trained, you can feed anything into it and get a reading. Benedict Cumberbatch: 96 percent metal.
You can also feed anything you like into a trained textgenrnn network. When you specify how many items you want and what temperature (creativity) the network should use, you can also give it a prefix. It will then try to come up with words that should follow that prefix. After I trained the Lifehacker headlines, I asked the AI to give me headlines beginning with “3 Ingredient Happy Hour.” It responded with some wonderful fictional cocktails (again, these are my picks out of a longer list):
3 Ingredient Happy Hour: The Herb Stressful Upgrade
3 Ingredient Happy Hour: A Cake’s Strawbreak
3 Ingredient Happy Hour: The Darkled Pot
3 Ingredient Happy Hour: The Pizza and Beverage For They Are Trader Wings
3 Ingredient Happy Hour: The Ferrent Pot
3 Ingredient Happy Hour: The Throat Into a Refreshing
3 Ingredient Happy Hour: The Best Bar Order
3 Ingredient Happy Hour: The Leftover Party Controci
3 Ingredient Happy Hour: A Summer Rum Cutting
3 Ingredient Happy Hour: The Best Coconati
3 Ingredient Happy Hour: The Beautiful Shicline
3 Ingredient Happy Hour: The Cheekey Candy
Don’t be surprised if you see these in a future Lifehacker post; Claire Lower, our food and beverage editor, says she wants to attempt making some of these.
But instead of waiting for her expert recipes, I decided to feed these into a neural network as well. I gathered some cocktail recipes from Chris Lowder’s cocktail guide and the WikiBooks cocktails glossary, and arranged them so that each cocktail took up one line of a text file, with the title of the cocktail as the first few words. That means I could choose a cocktail name and ask my cocktail-trained neural net to provide the recipe that follows. Here are a few of the results:
The Best Coconati – oz. Benedictine e. 1 dash Aromatic b. <1 oz. Cranberry d. .5 oz. Lemon c. .75 oz. Iteloun d. 2 dashes Juponged Slipes i. Stir/Strain/Coupe/No garnish
The Cheekey Candy i. 1 oz. Blendey Sherry b. 1.5 oz. Fresh Pineapple d. Lonstine Brandy Bowl De there at large Jamaic c. 2 Dashes Pineapple d. 1 dash Aromatic Bitters e. 1 dash Aromatic Gin ii. 1 oz. Vodka ii. .5 oz. Aged Rum c. 2 dashes of Angostura Bitters i. Stir/Strain/Nick & Nora glass/Ice/1
The Ferrent Pot – – 1.25 oz. Green Chartreuse 1.5 oz. London Dry Gin b. .75 oz. Fill Whiskey b. Orange half whiskey
You can ask it for anything, of course:
The Beth Skwarecki – 1 oz. Blended Scotch (Juice) Water b. 1 oz. Egg White in large rocks glass with dets 1934 or makes Babbino
The Lifehacker c. 14 Vodka Martini i. .75 oz. Campari i. Shake/Fine strain/Coupe/Lemon twist
The input data was only a few hundred cocktail recipes, so I had to turn the temperature way up to get anything interesting. And at a high temperature (1.0, in this case), sometimes you get words that aren’t really words. Good luck finding any Lonstine Brandy or Blendey Sherry in a store—but if you do, my pet AI will be very happy.
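The cocktail experiment above has two steps that are easy to sketch: squashing each recipe onto one line with the title first, then asking a trained network to continue a chosen name. The sketch below assumes the raw recipes are separated by blank lines, which may not match your source. It also assumes textgenrnn’s generate accepts the prefix and return_as_list keyword arguments, so check that against your installed version. Both function names and the sample recipes are made up.

```python
def flatten_recipes(raw_text):
    """Turn blank-line-separated recipes into one line each, title first."""
    lines_out = []
    for block in raw_text.split("\n\n"):
        parts = [part.strip() for part in block.splitlines() if part.strip()]
        if parts:
            lines_out.append(" ".join(parts))
    return lines_out

def recipes_for(model, cocktail_name, count=3, temperature=1.0):
    """Ask a trained model to continue a cocktail name into a recipe."""
    return model.generate(
        count, prefix=cocktail_name, temperature=temperature, return_as_list=True
    )

# Made-up recipes in an assumed layout: title line, then ingredient lines.
raw = "Daiquiri\n2 oz. Rum\n1 oz. Lime\n\nNegroni\n1 oz. Gin\n1 oz. Campari\n"
print(flatten_recipes(raw))

# With textgenrnn installed and a model trained on the flattened file:
#   from textgenrnn import textgenrnn
#   t = textgenrnn('textgenrnn_weights.hdf5')
#   print(recipes_for(t, 'The Best Coconati'))
```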