As a survey designer, this post really resonated. As a person trying to understand humans, and especially the history of race definitions and their origins, it resonated even more. Great post. This is also why Woke in its original form is important, as is CRT.
Thanks! I agree.
I am really glad someone flagged your post so that I saw it (and, as you spotted, I restacked it). You did a brilliant job of dissecting that stupid Times article, and into the bargain pointed out what appears to me to be a common problem in survey design.

As a side issue on what data and survey results show: in the case of Mamdani, I think the primary election results are actually quite complicated. I too often see what I believe are over-broad lessons drawn from them, which, as a resident of NYC, concerns me. For example, while it is sometimes assumed that it was mostly the well-to-do who broke for Cuomo over Mamdani, it is often overlooked that the lowest-income voters and Black voters broke for Cuomo too. It’s also important to recognize that fewer than 30% of eligible voters voted in the Democratic primary. So, while there is no question in my mind that Mamdani ran an excellent campaign, I myself am not at all sure what is fair to draw from this vote, even in terms of Democratic voters’ concerns and preferences. I think the city itself, while overwhelmingly Democratic, is nonetheless very divided, and I worry about what this portends for the general election and beyond.
Thank you, I’m glad you liked it. Election results are definitely a type of survey, and they are made even harder to interpret by the fact that we do not necessarily know anything about who voted and how (except via exit polls, which have their own potential information-bias issues).
Thank you, Ellie. Excellent post.
I would like to add: information bias like this is a serious problem in epidemiology, and you cannot correct for it with large sample sizes or anything else. A bigger sample only shrinks random error; systematic error from misrecorded data stays exactly where it is.
Here is a real-world example with insidious consequences. The California Department of Developmental Services (DDS) provides state-mandated services to all residents who qualify. It collects data on clients via forms (called CDER, for Client Development Evaluation Report) completed by the regional centers. Those forms are inherently defective in a way that gives the world wrong information about autism.
Specifically, the form requires an answer as to whether or not a client (e.g., one with autism) has intellectual disability (ID). There is no way to indicate that the center never evaluated the client for ID. The centers are not required to evaluate clients for ID once they have determined eligibility under autism, and they have no incentive to do so. Yet the form offers no way to say "don't know," so they check the box that says "No ID" even when that has never been established.
The predictable result is that the vast majority of California DDS CDER forms for autistic clients show "No ID" regardless of whether it is true. Nobody knows the accurate numbers, and there is no reason to believe the recorded ones.
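To make the distortion concrete, here is a minimal simulation sketch in Python. The rates (true_id_rate, eval_rate) are entirely made-up illustration parameters, not real DDS figures; the point is only the mechanism.

import random

random.seed(0)

def recorded_no_id_share(n_clients, true_id_rate=0.4, eval_rate=0.2):
    """Simulate CDER-style recording: any client who is not evaluated
    (or whose ID is therefore never found) defaults to 'No ID' on the
    form. All rates here are hypothetical illustration numbers."""
    no_id_forms = 0
    for _ in range(n_clients):
        has_id = random.random() < true_id_rate   # true status (unobserved)
        evaluated = random.random() < eval_rate   # center actually evaluates
        if not (evaluated and has_id):            # unevaluated -> default 'No ID'
            no_id_forms += 1
    return no_id_forms / n_clients

for n in (1_000, 100_000, 1_000_000):
    print(n, round(recorded_no_id_share(n), 3))
# Converges to 1 - eval_rate * true_id_rate = 0.92, not the true 0.60:
# a larger sample just pins down the wrong number more precisely.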
There's a famous paper in the Proceedings of the National Academy of Sciences (PNAS) that claims, based on these forms, that ID is rare in autism. That is clearly not correct, for many reasons. Another researcher I know wrote to the PNAS editor about this problem, and the journal declined even to publish a letter about it.
This is a good example of science based on corrupted data influencing the public to believe something that is not true.
Ref: www.pnas.org/cgi/doi/10.1073/pnas.2015762117
Yes, this information bias can’t easily be corrected for, except via very assumption-heavy methods and intensively collected validation data. This is why it’s always so important to look at how the data were collected!
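To illustrate what such an assumption-heavy method can look like: one classic validation-data approach is the Rogan–Gladen estimator, which back-corrects an observed prevalence using sensitivity and specificity measured on a subsample whose true status is known. A minimal sketch, with hypothetical numbers chosen to match the simulation above:

def rogan_gladen(observed_prev, sensitivity, specificity):
    """Rogan-Gladen corrected prevalence:
    (observed + specificity - 1) / (sensitivity + specificity - 1).
    Sensitivity and specificity must come from a validation study where
    true status is known; the correction is only as good as those inputs."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("recording must be informative (Se + Sp > 1)")
    p = (observed_prev + specificity - 1) / denom
    return min(max(p, 0.0), 1.0)  # clamp to a valid probability

# Hypothetical validation results: forms record ID for 8% of clients,
# the process catches true ID only 20% of the time (Se = 0.2) and never
# records ID falsely (Sp = 1.0). Corrected prevalence: 0.08 / 0.2 = 0.4.
print(rogan_gladen(0.08, sensitivity=0.2, specificity=1.0))

The catch, of course, is that the sensitivity and specificity themselves have to come from an intensive validation study, and the corrected estimate inherits all of that study's assumptions.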
I think about this a lot as well when I reflect on the now-taboo DEI initiatives that academic medical centers put in place. Many are in cities with large populations of multi-generational Black Americans born in the US, but the Black faculty members often are not. That is not necessarily a bad thing, but in a world where achieving representativeness means recruiting both foreign-born and US-born Black adults, transparency and more detailed options are probably a good thing!
Yes, this is an issue I’m familiar with, and it’s definitely not one that is helped in any way by collecting only very general categories. If we really want equality, having detailed data helps!