NYC Local Law 144 is a New York City ordinance regulating companies' use of AI, focused on hiring/retention/promotion bias and discrimination. The law requires companies using AI in any tool involved in hiring to audit those tools annually and to publish the audit results. The law applies to chatbot tools and to resume parsers intended to do keyword matching and other functions that chop resume content into database tables for easier employer searching and analysis. Some of those reports have already been published, as the law has been in effect for six months.
NLP What About Me?
NLP stands for Natural Language Processing. It is a method of analyzing language to organize it beyond mere words into discrete tokens (have you heard that word in the LLM world? Answer: yes), entities, geographic locations, and a slew of other analyses.
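For the curious, here is a minimal sketch of that kind of analysis, assuming the spaCy library and its small English model (en_core_web_sm) are installed; the sample resume line and the company name in it are invented.

```python
# A minimal sketch of classic NLP analysis, assuming spaCy and its small
# English model are installed:
#   pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

# Invented example text for illustration only
resume_line = "Managed a data team at Acme Corp in Cleveland, Ohio from 2018 to 2022."
doc = nlp(resume_line)

# Break the text into discrete tokens
print([token.text for token in doc])

# Pull out named entities: organizations, places, dates, and so on
for ent in doc.ents:
    print(ent.text, ent.label_)
```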
So 1950
NLP began in the 1950s as an attempt by computer scientists to teach machines how humans create and process language. It struggled not because the statistics, the analysis, or even the mathematical models were poor. It struggled because computing was not powerful enough to do the work needed to test the theories. Then, by the 1980s, rule-based systems and probabilistic models matured and NLP began to take off. Python is the dominant programming language today for those working in NLP. The point of all of this is to say, uh, resume parsing, keyword matching, and even semantic similarity predate ChatGPT by about 40 years or so. The New York City law, then, is attacking a problem that has always existed…AI is just the star that brought it to everyone's attention outside of programming nerds.
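As a concrete illustration of how old-school this is, here is a hedged sketch of keyword matching via TF-IDF and cosine similarity, a technique that long predates ChatGPT. It assumes scikit-learn is installed; the job posting and resume texts are invented.

```python
# Keyword matching via TF-IDF and cosine similarity. Assumes scikit-learn is
# installed; the job posting and resumes below are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

job_posting = "Seeking attorney with e-discovery, Python scripting, and litigation support experience."
resumes = [
    "Litigation paralegal with e-discovery platform experience and Python scripting.",
    "Marketing manager experienced in social media campaigns and brand strategy.",
]

# Build a shared vocabulary, then score each resume against the posting
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([job_posting] + resumes)
scores = cosine_similarity(matrix[0], matrix[1:]).flatten()

for resume, score in zip(resumes, scores):
    print(f"{score:.2f}  {resume}")
```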
That is not to say that AI-powered tools should be yawned at. Not at all. It is merely to say that the problem exists, it will always exist (because bias will always exist), and no AI tool is going to be able to eliminate bias entirely.
So far, the NYC ordinance is one of public disclosure. No fines, injunctions, or other remedies are imposed on those who have ___% of bias reflected in their annual reports.
The Devil, You Know Where He Is
The law requires the publication of “adverse impact ratios” purporting to show whether the use of the AI-powered tool has a disparate impact on a particular race or gender. There are a bunch of chickens and eggs here shoving to be first in line. If a given software platform is applied to hiring decisions and it turns out more of X group gets hired versus Y group, then, from a computer science standpoint, pinning that on AI is a “riddle wrapped in a mystery inside an enigma.”
A few examples to highlight the murky problem. Who controls what words or phrases an applicant uses on her resume? She does. Who controls what words or phrases an employer relies on to identify and match an applicant to a list of required job skills? The employer. Given that an applicant and an employer have, by virtue of one seeking a job from the other, never met, the odds of them doing some mind reading to get that match correct are, well, psychically unlikely. Assume an organization posts its results and its hiring decisions were 75% X group but only 25% Y group. Is that ratio indicative of a problem? Maybe. Is it indicative of bias in the tool used? Maybe. Is it indicative of the AI part of that tool being biased? Maybe. We are now three layers in, and it should be apparent that the solution to the riddle involves, well, what does it involve?
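For readers who want to see the arithmetic, here is a rough sketch, with invented numbers, of how an impact ratio of the kind the law contemplates is typically computed: each group's selection rate divided by the selection rate of the most-selected group.

```python
# A rough sketch, with invented numbers, of an impact ratio calculation:
# each group's selection rate divided by the highest group's selection rate.
applicants = {"group_x": 400, "group_y": 600}   # who applied (hypothetical)
selected   = {"group_x": 75,  "group_y": 25}    # who was selected (hypothetical)

selection_rates = {g: selected[g] / applicants[g] for g in applicants}
top_rate = max(selection_rates.values())
impact_ratios = {g: rate / top_rate for g, rate in selection_rates.items()}

for group, ratio in impact_ratios.items():
    print(f"{group}: selection rate {selection_rates[group]:.1%}, impact ratio {ratio:.2f}")
```

Note that a 75%/25% split of hires is not the same thing as a selection rate; the ratio depends on who applied in the first place, which is part of why pinning the result on the tool is so murky.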
Predicting is, well, unpredictable
AI is not some magic potion. It is a set of sophisticated and well-known mathematical/statistical approaches to predicting what might happen next. People are terrible at predictions. Therefore, non-AI hiring/retention/promotion decisions are likely to be, well, terrible. No universe exists today where an AI prediction tool will have __% bias, but tossing it out and using pure human prediction will have 50% less bias than AI. Not going to happen. And, if AI is to be used, and it already is all around you and will continue to be, and some adverse impact ratio occurs beyond a regulated threshold, humans will re-enter that process with, you guessed it, their inability to predict and, wait for it…biases.
The question might be better asked and answered this way: did the AI-powered tool have a disparate impact on X group despite the fact that most members of X group were as qualified as those in Y group chosen for hiring/retention/promotion?
I have often told non-lawyer types that one proxy for whether there is widespread discrimination in society, for example in hiring/retention/promotion, is the number of lawsuits making such claims that settle or prevail at trial. Have you casually noticed at your courthouse, on CLE program slates, or even on the local publicized dockets whether these types of cases are on the rise? Having built web scrapers to collect publicly available data of that type, I can tell you that since 2009 in my county in Ohio, those types of cases are rarely filed and rarely tried. We all know that if there were money to be made in such lawsuits, because there was provable discrimination in hiring/retention/promotion happening, those cases would be filed and aggressively litigated. And yet, there have to be calculable adverse impact ratios right now in NYC and elsewhere.
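For what it is worth, the scraping involved is not exotic. The sketch below is illustrative only: the URL, query parameters, and HTML markup are invented placeholders, it assumes the requests and beautifulsoup4 packages are installed, and a real court site will look different.

```python
# A bare-bones sketch of a docket scraper. The URL, query parameters, and
# HTML structure are invented placeholders; a real court site will differ.
# Assumes the requests and beautifulsoup4 packages are installed.
import requests
from bs4 import BeautifulSoup

SEARCH_URL = "https://example-county-court.example/docket/search"  # hypothetical

def count_discrimination_filings(year: int) -> int:
    resp = requests.get(SEARCH_URL, params={"case_type": "employment", "year": year}, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    rows = soup.select("table.results tr.case")  # hypothetical markup
    keywords = ("discrimination", "retaliation", "disparate")
    return sum(1 for row in rows if any(k in row.get_text().lower() for k in keywords))

for year in range(2009, 2024):
    print(year, count_discrimination_filings(year))
```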
The question is not really whether there is a disparate impact on group X or Y. The question is, can it be traced to an AI-powered tool? Will the publication of these ratios and other data lead to…what exactly? It will lead to embarrassment for some organization somewhere that ends up having the highest such ratio.
The law also requires employers to offer applicants an alternative means of applying for jobs that avoids the use of the AI-powered tool. This poses an interesting sociological/cultural experiment. Imagine an organization having 80% of its applicants exercise their rights under this provision and not having the AI-powered tool applied to their applications at all. That would be a powerful statement about the concern the public has about the use of these tools, these black boxes that many outside of computer science view with glassy-eyed concern but no real understanding of what goes on behind the curtain.
It’s The (Training) Data, People
Concern over algorithmic bias is not new. About ten years ago, Amazon used a recruiting algorithm to analyze ten years of software engineer applications. The tool searched for current candidates with similar backgrounds. Well, can you guess the demographic problem with using ten years of resumes from 2000-2010? Because most of those resumes were from men, the “learning” part of the algorithm quickly determined that resumes from women ought to be downgraded. But this is not an algorithm problem at all. The algorithm worked as expected…based upon its training data. If I were to teach a computer vision tool to identify a squirrel by showing it 10,000 images of squirrels from southern Ohio, it is going to fail to properly identify grey squirrels, flying squirrels, etc. Why? Because most squirrels in southern Ohio are brown or orange-y in color. That is not the fault of the algorithm. It is operator error: not providing the tool the full range of existing squirrel types.
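A toy sketch of that point, assuming scikit-learn is installed: train a model on synthetic data where one class makes up 95% of the examples, and it performs badly on the rare class without any “bug” in the algorithm. The data and parameters below are invented for illustration.

```python
# A toy illustration of the training-data point: the algorithm does exactly
# what the skewed data teaches it to do. Assumes scikit-learn is installed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Synthetic data where roughly 95% of examples belong to one class
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05],
                           class_sep=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Recall on the rare class usually lags far behind the common one -- not a
# bug, just a reflection of what the model was shown, like the southern-Ohio
# squirrel example above.
print(classification_report(y_test, model.predict(X_test)))
```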
Who Said What So Far?
Several large companies headquartered in NYC posted their data recently, including Pfizer, Morgan Stanley, and Cigna. While the law requires that the results be posted in the careers section of the respective websites, some researchers struggled to find the data. Many concluded that the data, often obscured for a time even from diligent investigators, would prove of little use to the typical job applicant. It brings to mind the philosophy in the pre-electronic-data era of litigation, when discovery responses would include 80,000 documents with the one or two that really matter buried somewhere in there. Obscurity by adversity…in finding the information you really want.
Human In the Loop Sometimes Messes Up The Loop
A potentially significant part of the analysis so far is the reality that applicants are not required to disclose their membership in any particular group when applying. Although employers seek to collect this information for internal use and as part of regulatory requirements, for applicants it is optional. Without that information, determining whether a given tool, AI-powered or not, is having an unlawful disparate impact on a group seems nearly impossible. Many applicants are understandably hesitant to disclose such information without, yes, any ability to predict whether it will help or hinder their application. How is one to know where their application will land if they disclose some racial, gender, or other group membership? And for regulators, how will they know whether that disclosure was the key factor in a hiring decision either way?
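A small sketch, with invented counts, of why the undisclosed bucket matters: the impact ratio for group Y swings substantially depending on what you assume about the applicants who declined to answer.

```python
# Hypothetical numbers showing how a large "declined to disclose" bucket makes
# the impact ratio ambiguous: the answer depends entirely on assumptions about
# who is hiding in that bucket.
applicants = {"group_x": 300, "group_y": 300, "undisclosed": 200}
selected   = {"group_x": 45,  "group_y": 30,  "undisclosed": 40}

def impact_ratio_for_y(extra_y_applicants: int, extra_y_selected: int) -> float:
    """Impact ratio for group Y if some undisclosed applicants were actually in group Y."""
    rate_x = selected["group_x"] / applicants["group_x"]
    rate_y = (selected["group_y"] + extra_y_selected) / (applicants["group_y"] + extra_y_applicants)
    return rate_y / max(rate_x, rate_y)

# Two extreme assumptions about the undisclosed applicants: the ratio moves
# from roughly 0.67 to roughly 0.93 on the same underlying hiring decisions.
print(impact_ratio_for_y(0, 0))       # none of the undisclosed were group Y
print(impact_ratio_for_y(200, 40))    # all of the undisclosed were group Y
```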
Yes, this is a start. But the bias and discrimination issues that existed before 10 minutes ago will persist to whatever degree they did in the past. It seems unlikely that an AI tool will impose more bias than naturally exists because, well, we are humans assessing other humans. As mentioned above, the most useful outcome of the NYC ordinance might well be the percentage of applicants who opt out of AI-powered analysis/filtering/decision making about their applications entirely. I am going to predict…ha, not at all.