Are AI Chatbots Politically Biased? What a Major Test Found

If you have ever asked an AI chatbot a political question and sensed the answer leaning in a particular direction, you may not have been imagining it. A major investigation by The Washington Post, published June 24, 2026, tested six leading AI systems on politically charged questions. The results are worth knowing before you rely on any of them for information about policy, elections, or public affairs.

What the researchers actually tested

The Post's team used more than two dozen political prompts developed by researchers at Dartmouth's Polarization Research Lab and Stanford. Questions covered campaign finance, affirmative action, military intervention, and healthcare policy. They also ran a companion survey with 10,000 participants to assess how people perceived the responses.

Six models were tested: OpenAI's ChatGPT, Google's Gemini, Anthropic's Claude, xAI's Grok (the chatbot from Elon Musk's AI company), China's DeepSeek, and Arya, a conservative-focused chatbot from Gab.

The results, model by model

ChatGPT showed the strongest leftward lean of any major chatbot. In 80% of responses, it presented only left-leaning arguments. Only 3% of its answers offered exclusively right-leaning positions. OpenAI told the Post that ChatGPT was built "to be objective by default and help people explore ideas from different perspectives," but said it could not replicate the findings.

Gemini was the standout performer. Google's chatbot delivered balanced responses, presenting arguments from both sides, in about 93% of cases. It even managed balance on questions like whether the U.S. should use its military to conquer territory for resources, a topic where many models might dodge or default to one side.

Claude landed in the middle. Anthropic's model presented left-leaning arguments exclusively in 43% of responses, while offering balanced treatment in the remaining 57%. Anthropic spokesperson Michael Aciman said the company "trains Claude to treat different political viewpoints equally and tests extensively for bias before every model launch."

Grok showed the most politically mixed results: 40% left-only, 33% right-only, and 27% presenting both sides. That makes it the only major chatbot that produced more right-leaning-only responses than balanced ones, an unusual profile compared to the others.

DeepSeek and Arya were also tested, though the investigation did not report their results in the same level of detail.

Why this happens

AI chatbots learn from large amounts of text: books, websites, news articles, social media. The distribution of that training data is not politically neutral. If the internet skews toward certain kinds of content, the models that learn from it will reflect that skew.

This is not necessarily intentional. The companies building these systems generally want them to be balanced. The challenge is that political balance is genuinely hard to define and even harder to measure. A model can be accurate on factual questions while still presenting a non-neutral framing, and framing matters.

There is also the question of what counts as bias. A model that declines to present one side of a contested issue might score as "unbalanced" even if it is declining for good reasons. The methodology used here, counting which arguments each model presented, is a reasonable approach. It is not the only way to measure bias, but it is one of the more systematic ones published to date.
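The counting approach described above can be sketched in a few lines of Python. This is only an illustration, not the Post's actual tooling: the label names (left_only, right_only, both_sides), the function name, and the ten sample labels are all invented for the example, and it assumes each response has already been classified by a human or some other process.

```python
from collections import Counter

# Hypothetical category names; the investigation's own coding scheme may differ.
CATEGORIES = ("left_only", "right_only", "both_sides")


def balance_shares(labels):
    """Return the percentage of labeled responses in each category."""
    if not labels:
        raise ValueError("need at least one labeled response")
    unknown = set(labels) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    counts = Counter(labels)
    total = len(labels)
    return {cat: 100.0 * counts[cat] / total for cat in CATEGORIES}


# Made-up labels for ten responses from one imaginary model (not real data).
example_labels = ["left_only"] * 4 + ["right_only"] * 3 + ["both_sides"] * 3
shares = balance_shares(example_labels)
print(shares)
```

The arithmetic is the easy part. The judgment call is in the labeling step that comes before it, which is where disagreements about what counts as bias tend to show up.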

What this means for you

If you use a chatbot to understand political issues, it helps to know its tendencies.

If you are using ChatGPT for policy research, be aware that in this test it presented only one side in more than four out of five responses (80% left-leaning only, 3% right-leaning only). You can ask it explicitly to give arguments on both sides of an issue.

If you are using Gemini, the investigation found it to be the most balanced of the major models tested, with both sides presented in about 93% of responses.

If you are using Claude, it was more balanced than ChatGPT in this test but still produced left-leaning-only answers in 43% of responses.

If you are using Grok, expect a different kind of imbalance. In this test it gave right-leaning-only answers in 33% of responses, more often than it gave balanced ones, alongside left-leaning-only answers in 40%.

For any chatbot, you can improve your results by prompting directly: "Give me the strongest arguments for and against [position]." Models that tend to default to one side often respond more evenly when explicitly asked for balance. It is not a perfect fix, but it helps.
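If you ask these questions often, or send them through a script, the prompt above is easy to turn into a reusable template. The sketch below is a minimal example; the function name balanced_prompt is made up for illustration, and it only builds the text. It does not call any chatbot's API.

```python
def balanced_prompt(position: str) -> str:
    """Build the both-sides prompt quoted in the article for a given position."""
    position = position.strip()
    if not position:
        raise ValueError("position must not be empty")
    return f"Give me the strongest arguments for and against {position}."


# Example: paste the result into whichever chatbot you use.
print(balanced_prompt("a federal carbon tax"))
```

Keeping the wording identical across chatbots also makes it easier to compare their answers side by side.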

If you are still figuring out which chatbot to use for research and analysis, our comparison of ChatGPT, Claude, and Gemini covers each model's strengths in more depth.

The bigger picture

This is not the first time AI political bias has been studied, and it will not be the last. Academic researchers have been probing these questions for years. The 2025 Dartmouth and Stanford framework this investigation used is one of the more rigorous methodologies available.

What has changed is that hundreds of millions of people now use these tools regularly for news, research, and everyday decisions. That makes the question of bias less academic and more practical.

The chatbot market is competitive, and companies have real incentives to improve. Gemini's strong showing in this test sets a useful benchmark. Even so, eliminating bias inherited from training data is genuinely difficult. Even companies with the best intentions face a hard technical and editorial problem.

For now, the safest approach is to use chatbots as a starting point, not an endpoint, for understanding political topics. Cross-check what they tell you against sources you already know and trust. And if a chatbot seems to be arguing only one side, ask it to argue the other. Sometimes it will.