ChatGPT Fairness Bias

Hey friends! Welcome to the latest developments in the AI world. Today's top AI news covers OpenAI’s study of fairness bias in ChatGPT, Anthropic’s updated AI safety policy, and Meta’s thought-driven LLMs. Also, meet Augmented Physics, a machine learning-powered tool designed to create interactive physics simulations from static textbook diagrams. Let’s dive in and enjoy this AI ride in just 4 minutes!

The AI World Today

  • OpenAI Studies ChatGPT Fairness Bias

  • Anthropic Updates AI Safety Policy

  • Meta Introduces Thought-Driven LLMs

    +

  • Heads Up

  • AI Solution

OpenAI Investigates ChatGPT Responses and Fairness Bias

Image: OpenAI

A study by OpenAI examined fairness in ChatGPT, focusing on how users' names influence responses. Researchers tested whether ChatGPT’s awareness of names associated with different genders, races, and ethnicities impacted response quality. The study found that overall response quality was consistent regardless of name, with less than 1% of responses reflecting harmful stereotypes. The research used a Language Model Research Assistant (LMRA) to analyze millions of requests while protecting privacy. It identified subtle biases in specific tasks, such as storytelling, where responses for female-sounding names more often featured female protagonists. The study highlights ongoing efforts to reduce bias, improve fairness, and promote transparency, with findings incorporated into OpenAI’s model evaluation process. The team encourages further collaboration on AI fairness research.
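To make the name-swap idea concrete, here is a minimal sketch of how such a comparison could be tallied. It is not OpenAI's code: the function names, the toy judge, and the "[stereotype]" marker are all hypothetical stand-ins. In the study, a language model (the LMRA) does the judging.

```python
from typing import Callable, Iterable, Tuple

# A pair holds two responses to the same request, differing only in the user's name.
ResponsePair = Tuple[str, str]


def harmful_difference_rate(
    pairs: Iterable[ResponsePair],
    judge: Callable[[str, str], bool],
) -> float:
    """Return the fraction of response pairs that the judge flags."""
    pair_list = list(pairs)
    if not pair_list:
        raise ValueError("need at least one response pair")
    flagged = sum(1 for a, b in pair_list if judge(a, b))
    return flagged / len(pair_list)


def toy_judge(response_a: str, response_b: str) -> bool:
    """Hypothetical stand-in for a model-based judge.

    Flags a pair when exactly one of the two responses carries the
    placeholder marker. A real judge would be a language model.
    """
    marker = "[stereotype]"
    return (marker in response_a) != (marker in response_b)


if __name__ == "__main__":
    demo_pairs = [
        ("Here is a budget plan.", "Here is a budget plan."),
        ("A story about a pilot.", "A story about a pilot. [stereotype]"),
    ]
    print(harmful_difference_rate(demo_pairs, toy_judge))
```

The point of the design is that the request stays fixed and only the name changes, so any flagged difference can be attributed to the name.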

Anthropic Introduces Enhanced Responsible Scaling Policy

Illustration: Superintelligence AI

Anthropic has updated its Responsible Scaling Policy (RSP), a governance framework for mitigating risks from advanced AI systems. The updated RSP introduces new capability thresholds and safeguards to address potential risks, such as AI autonomously advancing research or aiding in creating Chemical, Biological, Radiological, and Nuclear (CBRN) weapons. The policy uses AI Safety Level (ASL) Standards to keep protections proportional to risk, with ASL-3 and ASL-4 requiring enhanced safeguards. Anthropic aims to improve internal governance and solicit external expert feedback. Co-founder Jared Kaplan will now serve as Responsible Scaling Officer, overseeing the policy's implementation. The company hopes this policy will inspire best practices for AI risk management across the industry. Anthropic is also hiring for key roles supporting RSP implementation.

Meta Develops AI Models That Think Before Responding

Illustration: Superintelligence AI

Meta researchers introduced Thought Preference Optimization (TPO), a new method enabling large language models (LLMs) to "think" before responding to general instructions, not just reasoning tasks. TPO prompts LLMs to generate internal thoughts before producing a final response, optimizing outputs through trial-and-error without additional human data or supervision. The internal thoughts are private, and only the final answer is shown to users. TPO has shown superior performance on benchmarks like AlpacaEval and Arena-Hard, especially in non-reasoning tasks like marketing, health, and general knowledge. This approach contrasts with earlier methods by allowing the model to learn thought processes independently. Meta’s method aims to make LLMs more capable across various tasks, expanding their utility beyond technical domains.
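The think-then-respond loop can be sketched in a few lines. This is a rough illustration under stated assumptions, not Meta's implementation: the "<R>" delimiter, the function names, and the word-count judge are hypothetical. The idea shown is that several outputs are sampled for one instruction, only the visible response is scored, and the best and worst samples (thoughts included) form a preference pair for training.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

RESPONSE_MARKER = "<R>"  # hypothetical delimiter between thought and response


@dataclass
class Sample:
    thought: str   # private reasoning, never shown to the user
    response: str  # final answer, the only part the judge sees


def split_output(raw: str) -> Sample:
    """Split a raw model output into its thought and response parts."""
    thought, sep, response = raw.partition(RESPONSE_MARKER)
    if not sep:
        # No marker found: treat the whole output as the response.
        return Sample(thought="", response=raw.strip())
    return Sample(thought=thought.strip(), response=response.strip())


def build_preference_pair(
    raw_outputs: List[str],
    judge: Callable[[str], float],
) -> Tuple[Sample, Sample]:
    """Return (chosen, rejected), ranked by the judge's score of the response only."""
    samples = [split_output(raw) for raw in raw_outputs]
    if len(samples) < 2:
        raise ValueError("need at least two sampled outputs")
    ranked = sorted(samples, key=lambda s: judge(s.response))
    return ranked[-1], ranked[0]


def toy_judge(response: str) -> float:
    """Hypothetical stand-in for a judge model: longer responses score higher."""
    return float(len(response.split()))


if __name__ == "__main__":
    outputs = [
        "plan: be brief <R> Buy now.",
        "plan: list benefits <R> Our tea is fresh, organic, and fairly priced.",
    ]
    chosen, rejected = build_preference_pair(outputs, toy_judge)
    print("chosen:", chosen.response)
    print("rejected:", rejected.response)
```

Because the judge never sees the thought, the model is rewarded for whatever thinking leads to better answers rather than for thoughts that merely look good.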

Heads Up 

The New York Times issued a cease-and-desist to AI startup Perplexity, accusing it of using Times content without authorization for AI-generated summaries in violation of copyright law.

Google has released Gemma-APS, an open-source collection of models for text-to-propositions segmentation, distilled from fine-tuned Gemini Pro models using multi-domain synthetic data.

Arm announced that its Total Design ecosystem has doubled in size in a year, supporting global silicon innovation and addressing AI workload demands, chip complexity, and datacenter sustainability.

Amazon introduced a new AI-driven creative suite for advertisers, featuring tools to create video, audio, and animated image advertisements. 

At its MAX conference, Adobe unveiled Project Super Sonic, a prototype using text-to-audio and object recognition to quickly generate background audio and effects for videos.

AI Solution

Augmented Physics: Interactive Simulations for Textbooks

Augmented Physics is a machine learning-powered tool designed to create interactive physics simulations from static textbook diagrams. By leveraging advanced computer vision technologies like Segment Anything and multi-modal large language models (LLMs), this web-based tool allows users to semi-automatically extract diagrams from physics textbooks and generate simulations. The system supports various physics concepts such as optics, circuits, and kinematics, transforming traditional textbook content into interactive learning experiences. The interactive diagrams are seamlessly embedded into scanned textbook pages, enhancing personalized and hands-on learning. Augmented Physics incorporates four key augmentation strategies: augmented experiments, animated diagrams, bi-directional binding, and parameter visualization. Evaluated through technical tests, usability studies, and expert feedback, the tool significantly improves engagement and understanding in physics education, making it an essential resource for both educators and students.