AI News
Mapping the Misuse of Generative AI
New research analyzes the misuse of multimodal generative AI today, in order to help build safer and more responsible technologies
Generative artificial intelligence (AI) models that can produce image, text, audio, video and more are enabling a new era of creativity and commercial opportunity. Yet, as these capabilities grow, so does the potential for their misuse, including manipulation, fraud, bullying or harassment.
As part of our commitment to develop and use AI responsibly, we published a new paper, in partnership with Jigsaw and Google.org, analyzing how generative AI technologies are being misused today. Teams across Google are using this and other research to develop better safeguards for our generative AI technologies, amongst other safety initiatives.
Together, we gathered and analyzed nearly 200 media reports capturing public incidents of misuse, published between January 2023 and March 2024. From these reports, we defined and categorized common tactics for misusing generative AI and found novel patterns in how these technologies are being exploited or compromised.
By clarifying the current threats and tactics used across different types of generative AI outputs, our work can help shape AI governance and guide companies like Google and others building AI technologies in developing more comprehensive safety evaluations and mitigation strategies.
Highlighting the main categories of misuse
While generative AI tools represent a unique and compelling means to enhance creativity, the ability to produce bespoke, realistic content has the potential to be used in inappropriate ways by malicious actors.
By analyzing media reports, we identified two main categories of generative AI misuse tactics: the exploitation of generative AI capabilities and the compromise of generative AI systems. Examples of the technologies being exploited included creating realistic depictions of human likenesses to impersonate public figures, while instances of the technologies being compromised included ‘jailbreaking’ to remove model safeguards and using adversarial inputs to cause malfunctions.
Relative frequency of generative AI misuse tactics in our dataset. Any given case of misuse reported in the media could involve one or more tactics.
Cases of exploitation — involving malicious actors exploiting easily accessible, consumer-level generative AI tools, often in ways that didn’t require advanced technical skills — were the most prevalent in our dataset. For example, we reviewed a high-profile case from February 2024 where an international company reportedly lost HK$200 million (approx. US $26M) after an employee was tricked into making a financial transfer during an online meeting. In this instance, every other “person” in the meeting, including the company’s chief financial officer, was in fact a convincing, computer-generated imposter.
Some of the most prominent tactics we observed, such as impersonation, scams, and synthetic personas, pre-date the invention of generative AI and have long been used to influence the information ecosystem and manipulate others. But wider access to generative AI tools may alter the costs and incentives behind information manipulation, giving these age-old tactics new potency and reach, especially for those who previously lacked the technical sophistication to deploy them.
Identifying strategies and combinations of misuse
Falsifying evidence and manipulating human likenesses underlie the most prevalent tactics in real-world cases of misuse. In the period we analyzed, most cases of generative AI misuse were attempts to influence public opinion, enable scams or fraudulent activities, or generate profit.
By observing how bad actors combine their generative AI misuse tactics in pursuit of their various goals, we identified specific combinations of misuse and labeled these combinations as strategies.
Diagram of how the goals of bad actors (left) map onto their strategies of misuse (right).
Emerging forms of generative AI misuse, which aren’t overtly malicious, still raise ethical concerns. For example, new forms of political outreach are blurring the lines between authenticity and deception, such as government officials suddenly speaking a variety of voter-friendly languages without transparent disclosure that they’re using generative AI, and activists using the AI-generated voices of deceased victims to plead for gun reform.
While the study provides novel insights on emerging forms of misuse, it’s worth noting that this dataset is a limited sample of media reports. Media reports may prioritize sensational incidents, which in turn may skew the dataset towards particular types of misuse. Detecting or reporting cases of misuse may also be more challenging for those involved because generative AI systems are so novel. The dataset also doesn’t make a direct comparison between misuse of generative AI systems and traditional content creation and manipulation tactics, such as image editing or setting up ‘content farms’ to create large amounts of text, video, gifs, images and more. So far, anecdotal evidence suggests that traditional content manipulation tactics remain more prevalent.
Staying ahead of potential misuses
Our paper highlights opportunities to design initiatives that protect the public, such as advancing broad generative AI literacy campaigns, developing better interventions to protect the public from bad actors, or forewarning people and equipping them to spot and refute the manipulative strategies used in generative AI misuse.
This research helps our teams better safeguard our products by informing our development of safety initiatives. On YouTube, we now require creators to share when their work is meaningfully altered or synthetically generated, and seems realistic. Similarly, we updated our election advertising policies to require advertisers to disclose when their election ads include material that has been digitally altered or generated.
As we continue to expand our understanding of malicious uses of generative AI and make further technical advancements, we know it’s more important than ever to make sure our work isn’t happening in a silo. We recently joined the Coalition for Content Provenance and Authenticity (C2PA) as a steering committee member to help develop the technical standard and drive adoption of Content Credentials, tamper-resistant metadata that shows how content was made and edited over time.
In parallel, we’re also conducting research that advances existing red-teaming efforts, including improving best practices for testing the safety of large language models (LLMs), and developing pioneering tools to make AI-generated content easier to identify, such as SynthID, which is being integrated into a growing range of products.
In recent years, Jigsaw has conducted research with misinformation creators to understand the tools and tactics they use, developed prebunking videos to forewarn people of attempts to manipulate them, and shown that prebunking campaigns can improve misinformation resilience at scale. This work forms part of Jigsaw’s broader portfolio of information interventions to help people protect themselves online.
By proactively addressing potential misuses, we can foster responsible and ethical use of generative AI, while minimizing its risks. We hope these insights on the most common misuse tactics and strategies will help researchers, policymakers, and industry trust and safety teams build safer, more responsible technologies and develop better measures to combat misuse.
Acknowledgements
This research was a collective effort by Nahema Marchal, Rachel Xu, Rasmi Elasmar, Iason Gabriel, Beth Goldberg, and William Isaac, with feedback and advisory contributions from Mikel Rodriguez, Vijay Bolina, Alexios Mantzarlis, Seliem El-Sayed, Mevan Babakar, Matt Botvinick, Canfer Akbulut, Harry Law, Sébastien Krier, Ziad Reslan, Boxi Wu, Frankie Garcia, and Jennie Brennan.
A New Generation of African Talent Brings Cutting-Edge AI to Scientific Challenges
Food security, healthcare and exploring the cosmos are among the ways students of a new pan-African Master’s program aspire to apply AI
At Google DeepMind, we’re committed to supporting the next generation of artificial intelligence (AI) leaders to help build a stronger, more diverse and inclusive global AI community. This includes increasing access to AI and science through education.
Last year, we partnered with the African Institute for Mathematical Sciences (AIMS), Africa’s first network of centers of excellence in mathematical sciences, to launch an AI for Science Master’s program, with a $4.5M grant from Google DeepMind.
This funding helps AIMS provide full scholarships, equipment and compute to talented local students, giving them access to advanced studies in mathematics, AI and machine learning from world-class academics at AIMS South Africa. Students have the opportunity to accelerate scientific discovery, with mentoring and support from Google DeepMind’s researchers and engineers.
This summer, the first cohort of students graduated at a ceremony at the AIMS campus in Cape Town, South Africa. As the next generation of AI leaders in Africa, Béria Chingnabé Kalpélbé, Olivier Mahumawon Adjagba and Diffo Mboudjiho Annette Dariose shared their experiences in pioneering AI research and what they’re hoping to achieve with their work.
Béria Chingnabé Kalpélbé is passionate about applying AI to sustainability challenges.
Béria: Innovating for better food security
Sustainability is a top priority for Béria, originally from Chad. “I hope to develop solutions for sustainable agricultural development that will benefit both people and the planet by integrating principles of renewable energy, precision farming, and ecological preservation in my work,” he says.
“Beyond agriculture, AI offers significant potential to enhance the resilience of Africa’s natural environments,” Béria adds. “By implementing AI-powered monitoring and decision-support systems, we can safeguard Africa’s precious green areas and biodiversity for future generations.”
Olivier Mahumawon Adjagba wants to use AI to create more accurate prediction models for the spread of dengue fever.
Olivier: Pioneering virus transmission research through the lens of climate change
Olivier’s passion for applying mathematics to complex problems led him to AIMS South Africa: “Throughout my academic journey, I’ve been fascinated by the power of mathematics, particularly in addressing real-world challenges through AI,” he says. “A solid foundation in mathematical sciences is crucial for driving progress in areas such as healthcare, climate science and technology — and I’m eager to be at the forefront of these advancements.”
Originally from Benin, Olivier now looks to apply this approach to data from African countries to help understand the spread of dengue fever. “Using advanced AI techniques, I hope to create more accurate prediction models to inform public health strategies and interventions, ultimately contributing to the control and prevention of this viral disease.”
Discussing the personal impact of his scholarship, Olivier recounts, “Without it, pursuing advanced studies at such a prestigious institution would have been financially unattainable for me. This support enabled me to fully immerse myself in AIMS’ rigorous academic environment, so I could engage deeply in coursework, collaborate with professors and peers, and contribute meaningfully to research projects.”
Diffo Mboudjiho Annette Dariose hopes to learn more about our universe with the help of AI.
Diffo: Unraveling the secrets of our universe
Diffo, from Cameroon, is fascinated by the big questions beyond Earth — which is what drew her to the Square Kilometre Array (SKA), the largest and most sensitive radio telescope on the planet.
“Understanding the 21cm line provides insights into the early universe, the formation of the first stars and galaxies, and the structure of the cosmos,” Diffo explains. “By applying Markov chain Monte Carlo (MCMC) techniques, I hope to improve the accuracy and efficiency of extracting these faint signals from SKA data, potentially leading to more precise cosmological models and a deeper understanding of the future evolution of the universe.”
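The MCMC techniques Diffo mentions estimate a posterior distribution by proposing random moves and accepting or rejecting them based on how probable they are. The minimal Metropolis sampler below illustrates the idea; the Gaussian “posterior” is a stand-in for exposition and is not part of her SKA analysis pipeline:

```python
import math
import random

def metropolis(log_post, x0, n_steps=20000, step=0.5, seed=0):
    """Minimal Metropolis sampler: random-walk proposals, accepted
    with probability min(1, post(x_new) / post(x))."""
    rng = random.Random(seed)
    x, samples = x0, []
    lp = log_post(x)
    for _ in range(n_steps):
        x_new = x + rng.gauss(0, step)
        lp_new = log_post(x_new)
        if math.log(rng.random()) < lp_new - lp:  # accept/reject step
            x, lp = x_new, lp_new
        samples.append(x)
    return samples

# Hypothetical posterior: a Gaussian centred on a "signal amplitude" of 2.0.
log_post = lambda x: -0.5 * ((x - 2.0) / 0.3) ** 2

samples = metropolis(log_post, x0=0.0)
burn_in = samples[5000:]                      # discard early, unconverged draws
est = sum(burn_in) / len(burn_in)             # posterior-mean estimate, near 2.0
```

Real analyses use far more sophisticated samplers and likelihoods, but the accept/reject core is the same.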
For those considering similar studies, Diffo offers a few words of advice: “Stay curious, be persistent and embrace interdisciplinary learning. Engaging in hands-on projects, collaborating with peers, and seeking mentorship from AI experts can greatly benefit your learning experience and career prospects.”
Supporting AI education in Africa
This work builds on our existing commitments in the region, including: our support of the Deep Learning Indaba through volunteering and funding since its inception in 2017; the recent launch of our Experience AI education program across Africa, which has already engaged local educators working with more than 30,000 young people; and additional educational funding, which three further African universities have used to offer more than 40 postgraduate scholarships since 2020.
Increasing representation in the field of AI research offers a much-needed opportunity to bring diverse values, perspectives, and concerns into conversations about the design and deployment of this transformative technology. We hope our support for AIMS not only serves to build a more global and inclusive AI ecosystem, but also helps students make new scientific discoveries that benefit their local communities and the entire globe.
Learn more
Acknowledgements
With special thanks to Ulrich Paquet, research scientist at Google DeepMind, who is serving as executive director at AIMS South Africa until 2027 and helped launch the AI for Science Master’s program. Paquet continues to hold a dual affiliation with Google DeepMind.
We would also like to thank the University of Cape Town, Stellenbosch University, and the University of the Western Cape, the degree-granting institutions underpinning this program at AIMS South Africa, for their long-standing academic support.
AlphaProteo Generates Novel Proteins for Biology and Health Research
Research Protein Design and Wet Lab teams
New AI system designs proteins that successfully bind to target molecules, with potential for advancing drug design, disease understanding and more.
Every biological process in the body, from cell growth to immune responses, depends on interactions between molecules called proteins. Like a key to a lock, one protein can bind to another, helping regulate critical cellular processes. Protein structure prediction tools like AlphaFold have already given us tremendous insight into how proteins interact with each other to perform their functions, but these tools cannot create new proteins to directly manipulate those interactions.
Scientists, however, can create novel proteins that successfully bind to target molecules. These binders can help researchers accelerate progress across a broad spectrum of research, including drug development, cell and tissue imaging, disease understanding and diagnosis – even crop resistance to pests. While recent machine learning approaches to protein design have made great strides, the process is still laborious and requires extensive experimental testing.
Today, we introduce AlphaProteo, our first AI system for designing novel, high-strength protein binders to serve as building blocks for biological and health research. This technology has the potential to accelerate our understanding of biological processes, and aid the discovery of new drugs, the development of biosensors and more.
AlphaProteo can generate new protein binders for diverse target proteins, including VEGF-A, which is associated with cancer and complications from diabetes. This is the first time an AI tool has been able to design a successful protein binder for VEGF-A.
AlphaProteo also achieves higher experimental success rates and 3 to 300 times better binding affinities than the best existing methods on seven target proteins we tested.
Learning the intricate ways proteins bind to each other
Protein binders that can bind tightly to a target protein are hard to design. Traditional methods are time intensive, requiring multiple rounds of extensive lab work. After the binders are created, they undergo additional experimental rounds to optimize binding affinity, so they bind tightly enough to be useful.
Trained on large amounts of protein data from the Protein Data Bank (PDB) and more than 100 million predicted structures from AlphaFold, AlphaProteo has learned the myriad ways molecules bind to each other. Given the structure of a target molecule and a set of preferred binding locations on that molecule, AlphaProteo generates a candidate protein that binds to the target at those locations.
Illustration of a predicted protein binder structure interacting with a target protein. Shown in blue is the binder structure generated by AlphaProteo; shown in yellow is the target protein, the SARS-CoV-2 spike receptor-binding domain.
For one particular target, the viral protein BHRF1, 88% of our candidate molecules bound successfully when tested in the Google DeepMind Wet Lab. Based on the targets tested, AlphaProteo binders also bind 10 times more strongly, on average, than the best existing design methods.
For another target, TrkA, our binders are even stronger than the best prior designed binders to this target that have been through multiple rounds of experimental optimization.
Bar graph showing experimental in vitro success rates of AlphaProteo’s output for each of the seven target proteins, compared to other design methods. Higher success rates mean fewer designs must be tested to find successful binders.
Bar graph showing the best binding affinity achieved by AlphaProteo’s designs without experimental optimization for each of the seven target proteins, compared to other design methods. Affinity is reported as a dissociation constant, so lower values mean the binder protein binds more tightly to the target protein. Please note the logarithmic scale of the vertical axis.
Validating our results
Beyond in silico validation and testing AlphaProteo in our wet lab, we engaged the research groups of Peter Cherepanov, Katie Bentley and David LV Bauer at the Francis Crick Institute to validate our protein binders. Across different experiments, they dug deeper into some of our stronger SC2RBD and VEGF-A binders. The groups confirmed that the binding interactions of these binders were indeed similar to what AlphaProteo had predicted, and that the binders have useful biological function. For example, some of our SC2RBD binders were shown to prevent SARS-CoV-2 and some of its variants from infecting cells.
AlphaProteo’s performance indicates that it could drastically reduce the time needed for initial experiments involving protein binders for a broad range of applications. However, we know that our AI system has limitations: it was unable to design successful binders against an eighth target, TNFα, a protein associated with autoimmune diseases like rheumatoid arthritis. We selected TNFα to robustly challenge AlphaProteo, as computational analysis showed that it would be extremely difficult to design binders against. We will continue to improve and expand AlphaProteo’s capabilities with the goal of eventually addressing such challenging targets.
Achieving strong binding is usually only the first step in designing proteins that might be useful for practical applications, and there are many more bioengineering obstacles to overcome in the research and development process.
Towards responsible development of protein design
Protein design is a fast-evolving technology that holds lots of potential for advancing science in everything from understanding the factors that cause disease, to accelerating diagnostic test development for virus outbreaks, supporting more sustainable manufacturing processes, and even cleaning contaminants from the environment.
To account for potential risks in biosecurity, building on our long-standing approach to responsibility and safety, we’re working with leading external experts to inform our phased approach to sharing this work, and feeding into community efforts to develop best practices, including the NTI’s (Nuclear Threat Initiative) new AI Bio Forum.
Going forward, we’ll be working with the scientific community to leverage AlphaProteo on impactful biology problems and understand its limitations. We’ve also been exploring its drug design applications at Isomorphic Labs, and are excited for what the future holds.
At the same time, we’re continuing to improve the success rate and affinity of AlphaProteo’s algorithms, expanding the range of design problems it can tackle, and working with researchers in machine learning, structural biology, biochemistry and other disciplines to develop a responsible and more comprehensive protein design offering for the community.
If you’re a biologist whose research could benefit from target-specific protein binding and you’d like to register interest in being a trusted tester for AlphaProteo, please reach out to us at alphaproteo@google.com.
We’ll process messages received according to our Privacy Policy.
Acknowledgements
This research was co-developed by our Protein Design team and Wet Lab team.
We’d like to thank our collaborators Peter Cherepanov, David Bauer, Katie Bentley and their groups at the Francis Crick Institute for their invaluable experimental insights and results, the AlphaFold team, whose earlier work and algorithms provided training inputs and evaluation insights, and the many other teams across Google DeepMind who contributed to this program.
Latest Advances in Robot Dexterity
Research | Published: 12 September 2024 | Authors: Robotics Team
Two new AI systems, ALOHA Unleashed and DemoStart, help robots learn to perform complex tasks that require dexterous movement
People perform many tasks on a daily basis, like tying shoelaces or tightening a screw. But for robots, learning these highly-dexterous tasks is incredibly difficult to get right. To make robots more useful in people’s lives, they need to get better at making contact with physical objects in dynamic environments.
Today, we introduce two new papers featuring our latest artificial intelligence (AI) advances in robot dexterity research: ALOHA Unleashed which helps robots learn to perform complex and novel two-armed manipulation tasks; and DemoStart which uses simulations to improve real-world performance on a multi-fingered robotic hand.
By helping robots learn from human demonstrations and translate images to action, these systems are paving the way for robots that can perform a wide variety of helpful tasks.
Improving imitation learning with two robotic arms
Until now, most advanced AI robots have only been able to pick up and place objects using a single arm. In our new paper, we present ALOHA Unleashed, which achieves a high level of dexterity in bi-arm manipulation. With this new method, our robot learned to tie a shoelace, hang a shirt, repair another robot, insert a gear and even clean a kitchen.
The future of robot dexterity
Robotics is a unique area of AI research that shows how well our approaches work in the real world. For example, a large language model could tell you how to tighten a bolt or tie your shoes, but even if it was embodied in a robot, it wouldn’t be able to perform those tasks itself.
One day, AI robots will help people with all kinds of tasks at home, in the workplace and more. Dexterity research, including the efficient and general learning approaches we’ve described today, will help make that future possible.
We still have a long way to go before robots can grasp and handle objects with the ease and precision of people, but we’re making significant progress, and each groundbreaking innovation is another step in the right direction.
Acknowledgements
The authors of DemoStart: Maria Bauza, Jose Enrique Chen, Valentin Dalibard, Nimrod Gileadi, Roland Hafner, Antoine Laurens, Murilo F. Martins, Joss Moore, Rugile Pevceviciute, Dushyant Rao, Martina Zambelli, Martin Riedmiller, Jon Scholz, Konstantinos Bousmalis, Francesco Nori, Nicolas Heess.
The authors of Aloha Unleashed: Tony Z. Zhao, Jonathan Tompson, Danny Driess, Pete Florence, Kamyar Ghasemipour, Chelsea Finn, Ayzaan Wahid.
Updated Gemini Models, Reduced Pro Pricing, Increased Rate Limits
- >50% reduced price on 1.5 Pro (both input and output for prompts <128K)
- 2x higher rate limits on 1.5 Flash and ~3x higher on 1.5 Pro
- 2x faster output and 3x lower latency
- Updated default filter settings
The new models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, build on our latest experimental model releases and include meaningful improvements to the Gemini 1.5 models released at Google I/O in May. Developers can access our latest models for free via Google AI Studio and the Gemini API. For larger organizations and Google Cloud customers, the models are also available on Vertex AI.
Improved overall quality, with larger gains in math, long context, and vision
The Gemini 1.5 series is designed for general performance across a wide range of text, code, and multimodal tasks. For example, Gemini models can be used to synthesize information from 1,000-page PDFs, answer questions about repositories containing more than 10,000 lines of code, take in hour-long videos and create useful content from them, and more.
With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-efficient to build with in production. We see a ~7% increase in MMLU-Pro, a more challenging version of the popular MMLU benchmark. On MATH and HiddenMath (an internal holdout set of competition math problems) benchmarks, both models have made a considerable ~20% improvement. For vision and code use cases, both models also perform better (ranging from ~2-7%) across evals measuring visual understanding and Python code generation.
We also improved the overall helpfulness of model responses, while continuing to uphold our content safety policies and standards. This means less punting/fewer refusals and more helpful responses across many topics.
In response to developer feedback, both models now have a more concise default style, intended to make them easier to use and reduce costs. For use cases like summarization, question answering, and extraction, the default output length of the updated models is ~5-20% shorter than previous models. For chat-based products where users might prefer longer responses by default, you can read our prompting strategies guide to learn more about how to make the models more verbose and conversational.
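One common way to steer verbosity is a system instruction attached to the request. The sketch below builds a generateContent request body by hand; the payload shape follows the public Gemini REST API, and the instruction wording and token limit are purely illustrative:

```python
import json

# Sketch of a generateContent request body that nudges the model toward
# longer, more conversational replies via a system instruction.
payload = {
    "system_instruction": {
        "parts": [{"text": "Respond conversationally and in detail, "
                           "expanding on answers with examples."}]
    },
    "contents": [
        {"role": "user", "parts": [{"text": "Explain context caching."}]}
    ],
    "generationConfig": {"maxOutputTokens": 2048},
}

body = json.dumps(payload)
# POST `body` to .../v1beta/models/<model>:generateContent with your API key.
```

The official client libraries expose the same fields as keyword arguments, so the dictionary above maps directly onto them.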
For more details on migrating to the latest versions of Gemini 1.5 Pro and 1.5 Flash, check out the Gemini API models page.
Gemini 1.5 Pro
We continue to be blown away with the creative and useful applications of Gemini 1.5 Pro’s 2 million token long context window and multimodal capabilities. From video understanding to processing 1000 page PDFs, there are so many new use cases still to be built. Today we are announcing a 64% price reduction on input tokens, a 52% price reduction on output tokens, and a 64% price reduction on incremental cached tokens for our strongest 1.5 series model, Gemini 1.5 Pro, effective October 1st, 2024, on prompts less than 128K tokens. Coupled with context caching, this continues to drive the cost of building with Gemini down.
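As a worked example of what those reductions mean per request (the base rates below are placeholders for illustration, not Google’s actual price sheet):

```python
# Announced reductions: 64% on input tokens, 52% on output tokens.
# The "old" per-million-token prices here are hypothetical placeholders.
old_input, old_output = 3.50, 10.50      # $/1M tokens (illustrative)

new_input = old_input * (1 - 0.64)       # -> 1.26 $/1M tokens
new_output = old_output * (1 - 0.52)     # -> 5.04 $/1M tokens

# Cost of a 100K-token prompt (under the <128K threshold) with a 2K reply:
cost = (100_000 * new_input + 2_000 * new_output) / 1_000_000
```

Combined with context caching, repeated large prompts get cheaper still, since cached tokens are billed at the reduced incremental rate.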
Increased rate limits
To make it even easier for developers to build with Gemini, we are increasing the paid tier rate limits for 1.5 Flash to 2,000 RPM and increasing 1.5 Pro to 1,000 RPM, up from 1,000 and 360, respectively. In the coming weeks, we expect to continue to increase the Gemini API rate limits so developers can build more with Gemini.
2x faster output and 3x lower latency
Along with core improvements to our latest models, over the last few weeks we have driven down the latency with 1.5 Flash and significantly increased the output tokens per second, enabling new use cases with our most powerful models.
Updated filter settings
Since the first launch of Gemini in December of 2023, building a safe and reliable model has been a key focus. With the latest versions of Gemini (-002 models), we’ve made improvements to the model’s ability to follow user instructions while balancing safety. We will continue to offer a suite of safety filters that developers may apply to Google’s models. For the models released today, the filters will not be applied by default so that developers can determine the configuration best suited for their use case.
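For developers who do want filtering applied, thresholds are set per harm category on each request. This is an illustrative request body only; the category and threshold enum names follow the Gemini API’s published safety settings:

```python
import json

# With the -002 models, safety filters are opt-in: attach per-category
# thresholds to the request. Enum names per the Gemini API safety settings.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT",        "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_HATE_SPEECH",       "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_LOW_AND_ABOVE"},
]

request_body = json.dumps({
    "contents": [{"role": "user", "parts": [{"text": "Hello"}]}],
    "safetySettings": safety_settings,
})
```

Omitting `safetySettings` entirely leaves the new default (no filters applied) in effect, so each product team can pick the configuration that fits its use case.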
Gemini 1.5 Flash-8B Experimental updates
We are releasing a further improved version of the Gemini 1.5 model we announced in August called “Gemini-1.5-Flash-8B-Exp-0924.” This improved version includes significant performance increases across both text and multimodal use cases. It is available now via Google AI Studio and the Gemini API.
The overwhelmingly positive feedback developers have shared about 1.5 Flash-8B has been incredible to see, and we will continue to shape our experimental-to-production release pipeline based on developer feedback.
We’re excited about these updates and can’t wait to see what you’ll build with the new Gemini models! And for Gemini Advanced users, you will soon be able to access a chat optimized version of Gemini 1.5 Pro-002.
Google DeepMind at NeurIPS 2024
Advancing adaptive AI agents, empowering 3D scene creation, and innovating LLM training for a smarter, safer future
Next week, AI researchers worldwide will gather for the 38th Annual Conference on Neural Information Processing Systems (NeurIPS), taking place December 10-15 in Vancouver, Canada.
Two papers led by Google DeepMind researchers will be recognized with Test of Time awards for their “undeniable influence” on the field. Ilya Sutskever will present on Sequence to Sequence Learning with Neural Networks, which was co-authored with Google DeepMind VP of Drastic Research Oriol Vinyals and Distinguished Scientist Quoc V. Le. Google DeepMind scientists Ian Goodfellow and David Warde-Farley will present on Generative Adversarial Nets.
We’ll also show how we translate our foundational research into real-world applications, with live demonstrations including Gemma Scope, AI for music generation, weather forecasting and more.
Teams across Google DeepMind will present more than 100 new papers on topics ranging from AI agents and generative media to innovative learning approaches.
- Google DeepMind at NeurIPS 2024 schedule
- Google Research at NeurIPS 2024 schedule
Building adaptive, smart, and safe AI Agents
LLM-based AI agents are showing promise in carrying out digital tasks via natural language commands. Yet their success depends on precise interaction with complex user interfaces, which requires extensive training data. With AndroidControl, we share the most diverse control dataset to date, with over 15,000 human-collected demos across more than 800 apps. AI agents trained using this dataset showed significant performance gains, which we hope will help advance research into more general AI agents.
For AI agents to generalize across tasks, they need to learn from each experience they encounter. We present a method for in-context abstraction learning (ICAL) that helps agents grasp key task patterns and relationships from imperfect demos and natural language feedback, enhancing their performance and adaptability.
A frame from a video demonstration of someone making a sauce, with individual elements identified and numbered. ICAL is able to extract the important aspects of the process.
Developing agentic AI that works to fulfill users’ goals can help make the technology more useful, but alignment is critical when developing AI that acts on our behalf. To that end, we propose a theoretical method to measure an AI system’s goal-directedness, and also show how a model’s perception of its user can influence its safety filters. Together, these insights underscore the importance of robust safeguards to prevent unintended or unsafe behaviors, ensuring that AI agents’ actions remain aligned with safe, intended uses.
Advancing 3D scene creation and simulation
As demand for high-quality 3D content grows across industries like gaming and visual effects, creating lifelike 3D scenes remains costly and time-intensive. Our recent work introduces novel 3D generation, simulation, and control approaches, streamlining content creation for faster, more flexible workflows.
Producing high-quality, realistic 3D assets and scenes often requires capturing and modeling thousands of 2D photos. We showcase CAT3D, a system that can create 3D content in as little as a minute, from any number of images — even just one image, or a text prompt. CAT3D accomplishes this with a multi-view diffusion model that generates additional consistent 2D images from many different viewpoints, and uses those generated images as input for traditional 3D modelling techniques. Results surpass previous methods in both speed and quality.
CAT3D enables 3D scene creation from any number of generated or real images.
Left to right: Text-to-image-to-3D, a real photo to 3D, several photos to 3D.
Simulating scenes with many rigid objects, like a cluttered tabletop or tumbling Lego bricks, also remains computationally intensive. To overcome this roadblock, we present a new technique called SDF-Sim that represents object shapes in a scalable way, speeding up collision detection and enabling efficient simulation of large, complex scenes.
A complex simulation of shoes falling and colliding, accurately modelled using SDF-Sim
AI image generators based on diffusion models struggle to control the 3D position and orientation of multiple objects. Our solution, Neural Assets, introduces object-specific representations that capture both appearance and 3D pose, learned through training on dynamic video data. Neural Assets enables users to move, rotate, or swap objects across scenes—a useful tool for animation, gaming, and virtual reality.
Given a source image and object 3D bounding boxes, we can translate, rotate, and rescale the object, or transfer objects or backgrounds between images
Improving how LLMs learn and respond
We’re also advancing how LLMs train, learn, and respond to users, improving performance and efficiency on several fronts.
With larger context windows, LLMs can now learn from potentially thousands of examples at once — known as many-shot in-context learning (ICL). This process boosts model performance on tasks like math, translation, and reasoning, but often requires high-quality, human-generated data. To make training more cost-effective, we explore methods for adapting many-shot ICL that reduce reliance on manually curated data.

With so much data available for training language models, the main constraint for teams building them becomes the available compute. We address an important question: with a fixed compute budget, how do you choose the right model size to achieve the best results?
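As a rough illustration of the many-shot ICL setup, the prompt simply packs a large number of solved examples ahead of the new query. This is a minimal sketch; the function name and prompt format are hypothetical, not taken from the paper.

```python
# Sketch of many-shot in-context learning (ICL): instead of a handful
# of demonstrations, the prompt includes hundreds or thousands of
# solved examples before the new query. Format is illustrative only.

def build_many_shot_prompt(examples, query, max_shots=1000):
    """Assemble a many-shot ICL prompt from (problem, solution) pairs."""
    shots = []
    for problem, solution in examples[:max_shots]:
        shots.append(f"Problem: {problem}\nSolution: {solution}")
    # The final entry leaves the solution blank for the model to complete.
    shots.append(f"Problem: {query}\nSolution:")
    return "\n\n".join(shots)

demos = [("2 + 3", "5"), ("7 * 6", "42"), ("10 - 4", "6")]
prompt = build_many_shot_prompt(demos, "8 + 9")
```

In practice the demonstrations number in the thousands, which is exactly why reducing the need for manually curated, human-generated examples matters for cost.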
Another innovative approach, which we call Time-Reversed Language Models (TRLM), explores pretraining and finetuning an LLM to work in reverse. When given traditional LLM responses as input, a TRLM generates queries that might have produced those responses. When paired with a traditional LLM, this method not only helps ensure responses follow user instructions better, but also improves the generation of citations for summarized text, and enhances safety filters against harmful content.
Curating high-quality data is vital for training large AI models, but manual curation is difficult at scale. To address this, our Joint Example Selection (JEST) algorithm optimizes training by identifying the most learnable data within larger batches, enabling up to 13× fewer training rounds and 10× less computation, outperforming state-of-the-art multimodal pretraining baselines.
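The idea of identifying "learnable" data can be sketched as scoring each candidate by how much harder the current learner finds it than a reference model does, then keeping the top scorers. Note this per-example version is a simplification: JEST itself selects sub-batches jointly, and the function below is an illustrative assumption, not the paper's algorithm.

```python
import numpy as np

# Simplified sketch in the spirit of learnability-based selection:
# score each candidate by (learner loss - reference loss). Examples the
# learner finds hard but the reference finds easy are most "learnable".

def select_learnable(learner_loss, reference_loss, k):
    """Return indices of the k examples with the highest learnability."""
    learnability = np.asarray(learner_loss) - np.asarray(reference_loss)
    return np.argsort(learnability)[::-1][:k]

learner = [2.0, 0.5, 3.0, 1.0]    # current model struggles on items 0 and 2
reference = [1.8, 0.4, 0.5, 1.1]  # reference finds item 2 easy -> learnable
idx = select_learnable(learner, reference, k=2)
```

Selecting data this way, within large super-batches, is what lets training concentrate compute on the examples that most improve the model.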
Planning tasks are another challenge for AI, particularly in stochastic environments, where outcomes are influenced by randomness or uncertainty. Researchers use various inference types for planning, but there’s no consistent approach. We demonstrate that planning itself can be viewed as a distinct type of probabilistic inference and propose a framework for ranking different inference techniques based on their planning effectiveness.
Bringing together the global AI community
We’re proud to be a Diamond Sponsor of the conference, and support Women in Machine Learning, LatinX in AI and Black in AI in building communities around the world working in AI, machine learning and data science.
If you’re at NeurIPS this year, swing by the Google DeepMind and Google Research booths to explore cutting-edge research in demos, workshops and more throughout the conference.
New Generative AI Tools Open the Doors of Music Creation
Our latest AI music technologies are now available in MusicFX DJ, Music AI Sandbox and YouTube Shorts
For nearly a decade, our teams have been exploring how artificial intelligence (AI) can support the creative process, building tools that empower enthusiasts and professionals to discover new forms of creative expression.
Over the past year, we’ve been working in close collaboration with partners across the music industry through our Music AI Incubator and more. Their input has been guiding our state-of-the-art generative music experiments, and helping us ensure that our new generative AI tools responsibly open the doors of music creation to everyone.
Today, in partnership with Google Labs, we’re releasing a reimagined experience for MusicFX DJ that makes it easier for anyone to generate music, interactively, in real time.
We’re also announcing updates to our music AI toolkit, called Music AI Sandbox, and highlighting our latest AI music technologies in YouTube’s Dream Track, a suite of experiments that creators can use to generate high-quality instrumentals for their Shorts and videos.
Generating live music with MusicFX DJ
At I/O this year, we shared an early preview of MusicFX DJ, a digital tool that anyone can play like an instrument, making the joy of live music creation more accessible to people of all skill levels.
Today, we’re introducing a number of updates to MusicFX DJ, including an expanded set of intuitive controls, a reimagined interface, improved audio quality and new model behaviors. These capabilities let players generate and steer a continuous flow of music, share their creations with friends and play a jam session together.
Working in close collaboration with Jacob Collier — a six-time GRAMMY award-winning singer, songwriter, producer and multi-instrumentalist — we designed these updates to make MusicFX DJ more accessible, useful and inspiring.
Jacob Collier
YouTube’s Dream Track experiment now generates instrumental soundtracks
Building on our ongoing work with YouTube, we’ve evolved our Dream Track experiment to allow U.S. creators to explore a range of genres and prompts that generate instrumental soundtracks with powerful text-to-music models.
Our latest music generation models are trained with a novel reinforcement learning approach to have higher audio quality, while also paying better attention to the nuances of a user’s text prompts. Responsibly deploying generative technologies is core to our values, so all music generated by MusicFX DJ and Dream Track is watermarked using SynthID.
Building the future of music creation together
We’ve been delighted to work with partners in the music community over the past year to help build technology that’s both responsive to the needs of professionals and expands access for the next generation of musicians.
We’re looking forward to deepening these partnerships as we build the future of music creation together, developing even better tools to inspire creativity.
This work was made possible by core research and engineering efforts from Andrea Agostinelli, Zalán Borsos, George Brower, Antoine Caillon, Cătălina Cangea, Noah Constant, Michael Chang, Chris Deaner, Timo Denk, Chris Donahue, Michael Dooley, Jesse Engel, Christian Frank, Beat Gfeller, Tobenna Peter Igwe, Drew Jaegle, Matej Kastelic, Kazuya Kawakami, Pen Li, Ethan Manilow, Yotam Mann, Colin McArdell, Brian McWilliams, Adam Roberts, Matt Sharifi, Ian Simon, Ondrej Skopek, Marco Tagliasacchi, Cassie Tarakajian, Alex Tudor, Victor Ungureanu, Mauro Verzetti, Damien Vincent, Luyu Wang, Björn Winkler, Yan Wu, and Mauricio Zuluaga.
MusicFX DJ was developed by Antoine Caillon, Noah Constant, Jesse Engel, Alberto Lalama, Hema Manickavasagam, Adam Roberts, Ian Simon, and Cassie Tarakajian in collaboration with our partners from Google Labs including Obed Appiah-Agyeman, Tahj Atkinson, Carlie de Boer, Phillip Campion, Sai Kiran Gorthi, Kelly Lau-Kee, Elias Roman, Noah Semus, Trond Wuellner, Kristin Yim, and Jamie Zyskowski. We give our deepest thanks to Jacob Collier, Ben Bloomberg, and Fran Haincourt for their valuable feedback throughout the development process.
Music AI Sandbox was developed by Andrea Agostinelli, George Brower, Ross Cairns (xWF), Michael Chang, Yeawon Choi, Chris Deaner, Jesse Engel, Reed Enger, Beat Gfeller, Tom Hume, Tom Jenkins, Max Edelmann (xWF), Drew Jaegle, Jacob Kelly, DY Kim, David Madras, Hema Manickavasagam, Ethan Manilow, Yotam Mann, Colin McArdell, Chris Reardon, Felix Riedel, Adam Roberts, Arathi Sethumadhavan, Eleni Shaw, Sage Stevens, Amy Stuart, Luyu Wang, Pawel Wluka, and Yan Wu in collaboration with our partners in YouTube and Tech & Society.
Dream Track was developed by Andrea Agostinelli, Zalán Borsos, Geoffrey Cideron, Timo Denk, Michael Dooley, Christian Frank, Sertan Girgin, Myriam Hamed Torres, Matej Kastelic, Pen Li, Brian McWilliams, Matt Sharifi, Ondrej Skopek, Marco Tagliasacchi, Mauro Verzetti, Mauricio Zuluaga, in collaboration with our partners in YouTube.
Special thanks to Aäron van den Oord, Tom Hume, Douglas Eck, Eli Collins, Mira Lane, Koray Kavukcuoglu, and Demis Hassabis for their insightful guidance and support throughout the research process. Thanks to Mahyar Bordbar and DY Kim for helping coordinate these efforts, as well as the YouTube Artist Partnerships team for their support partnering with the music industry.
We also acknowledge the many other individuals who contributed across Google DeepMind and Alphabet, including our partners at YouTube.
Pushing the Frontiers of Audio Generation
Helping people around the world
Our pioneering speech generation technologies are helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools.
Speech is central to human connection. It helps people around the world exchange information and ideas, express emotions and create mutual understanding. As our technology built for generating natural, dynamic voices continues to improve, we’re unlocking richer, more engaging digital experiences.
Over the past few years, we’ve been pushing the frontiers of audio generation, developing models that can create high quality, natural speech from a range of inputs, like text, tempo controls and particular voices. This technology powers single-speaker audio in many Google products and experiments — including Gemini Live, Project Astra, Journey Voices and YouTube’s auto dubbing — and is helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools.
Working together with partners across Google, we recently helped develop two new features that can generate long-form, multi-speaker dialogue for making complex content more accessible:
- NotebookLM Audio Overviews turns uploaded documents into engaging and lively dialogue. With one click, two AI hosts summarize user material, make connections between topics and banter back and forth.
- Illuminate creates formal AI-generated discussions about research papers to help make knowledge more accessible and digestible.
Here, we provide an overview of our latest speech generation research underpinning all of these products and experimental tools.
Pioneering techniques for audio generation
For years, we’ve been investing in audio generation research and exploring new ways for generating more natural dialogue in our products and experimental tools. In our previous research on SoundStorm, we first demonstrated the ability to generate 30-second segments of natural dialogue between multiple speakers.
This extended our earlier work, SoundStream and AudioLM, which allowed us to apply many text-based language modeling techniques to the problem of audio generation.
SoundStream is a neural audio codec that efficiently compresses and decompresses an audio input, without compromising its quality. As part of the training process, SoundStream learns how to map audio to a range of acoustic tokens. These tokens capture all of the information needed to reconstruct the audio with high fidelity, including properties such as prosody and timbre.
AudioLM treats audio generation as a language modeling task to produce the acoustic tokens of codecs like SoundStream. As a result, the AudioLM framework makes no assumptions about the type or makeup of the audio being generated, and can flexibly handle a variety of sounds without needing architectural adjustments — making it a good candidate for modeling multi-speaker dialogues.
Example of a multi-speaker dialogue generated by NotebookLM Audio Overview, based on a few potato-related documents.
Building upon this research, our latest speech generation technology can produce 2 minutes of dialogue, with improved naturalness, speaker consistency and acoustic quality, when given a script of dialogue and speaker turn markers. The model performs this task in under 3 seconds on a single Tensor Processing Unit (TPU) v5e chip, in one inference pass. This means it generates audio more than 40 times faster than real time.
Scaling our audio generation models
Scaling our single-speaker generation models to multi-speaker models then became a matter of data and model capacity. To help our latest speech generation model produce longer speech segments, we created an even more efficient speech codec that compresses audio into a sequence of tokens at rates as low as 600 bits per second, without compromising the quality of its output.
The tokens produced by our codec have a hierarchical structure and are grouped by time frames. The first tokens within a group capture phonetic and prosodic information, while the last tokens encode fine acoustic details.
Even with our new speech codec, producing a 2-minute dialogue requires generating over 5000 tokens. To model these long sequences, we developed a specialized Transformer architecture that can efficiently handle hierarchies of information, matching the structure of our acoustic tokens.
With this technique, we can efficiently generate acoustic tokens that correspond to the dialogue, within a single autoregressive inference pass. Once generated, these tokens can be decoded back into an audio waveform using our speech codec.
Animation showing how our speech generation model produces a stream of audio tokens autoregressively, which are decoded back to a waveform consisting of a two-speaker dialogue.
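A quick back-of-the-envelope check ties the numbers above together, assuming the 600 bits-per-second figure applies uniformly across the dialogue (an assumption; the post gives only the totals).

```python
# Sanity-check the codec numbers: a 600 bps codec over a 2-minute
# dialogue, which the post says requires "over 5000 tokens".

bitrate_bps = 600
dialogue_seconds = 2 * 60
total_bits = bitrate_bps * dialogue_seconds   # 72,000 bits for 2 minutes

tokens = 5000                                 # "over 5000 tokens"
bits_per_token = total_bits / tokens          # ~14.4 bits per token
```

At roughly 14 bits of information per token, the sequence length (thousands of tokens), rather than the bitrate, is the binding constraint — which is why the specialized hierarchical Transformer described above is needed.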
To teach our model how to generate realistic exchanges between multiple speakers, we pretrained it on hundreds of thousands of hours of speech data. Then we finetuned it on a much smaller dataset of dialogue with high acoustic quality and precise speaker annotations, consisting of unscripted conversations from a number of voice actors and realistic disfluencies — the “umm”s and “aah”s of real conversation. This step taught the model how to reliably switch between speakers during a generated dialogue and to output only studio quality audio with realistic pauses, tone and timing.
In line with our AI Principles and our commitment to developing and deploying AI technologies responsibly, we’re incorporating our SynthID technology to watermark non-transient AI-generated audio content from these models, to help safeguard against the potential misuse of this technology.
New speech experiences ahead
We’re now focused on improving our model’s fluency, acoustic quality and adding more fine-grained controls for features, like prosody, while exploring how best to combine these advances with other modalities, such as video.
The potential applications for advanced speech generation are vast, especially when combined with our Gemini family of models. From enhancing learning experiences to making content more universally accessible, we’re excited to continue pushing the boundaries of what’s possible with voice-based technologies.
Acknowledgements
Authors of this work: Zalán Borsos, Matt Sharifi, Brian McWilliams, Yunpeng Li, Damien Vincent, Félix de Chaumont Quitry, Martin Sundermeyer, Eugene Kharitonov, Alex Tudor, Victor Ungureanu, Karolis Misiunas, Sertan Girgin, Jonas Rothfuss, Jake Walker and Marco Tagliasacchi.
We thank Leland Rechis, Ralph Leith, Paul Middleton, Poly Pata, Minh Truong and RJ Skerry-Ryan for their critical efforts on dialogue data.
We’re very grateful to our collaborators across Labs, Illuminate, Cloud, Speech and YouTube for their outstanding work bringing these models into products.
We also thank Françoise Beaufays, Krishna Bharat, Tom Hume, Simon Tokumine, and James Zhao for their guidance on the project.
GenCast Predicts Weather and the Risks of Extreme Conditions with state-of-the-art Accuracy
Technologies | Authors: Ilan Price and Matthew Willson
New AI model advances the prediction of weather uncertainties and risks, delivering faster, more accurate forecasts up to 15 days ahead
Weather impacts all of us — shaping our decisions, our safety, and our way of life. As climate change drives more extreme weather events, accurate and trustworthy forecasts are more essential than ever. Yet, weather cannot be predicted perfectly, and forecasts are especially uncertain beyond a few days.
Because a perfect weather forecast is not possible, scientists and weather agencies use probabilistic ensemble forecasts, where the model predicts a range of likely weather scenarios. Such ensemble forecasts are more useful than relying on a single forecast, as they provide decision makers with a fuller picture of possible weather conditions in the coming days and weeks and how likely each scenario is.
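Concretely, an ensemble turns many scenarios into a probability by counting how many members predict an event. The sketch below is a minimal illustration of that idea; the values and threshold are made up, not GenCast output.

```python
# Minimal sketch of how an ensemble forecast yields a probability,
# e.g. P(wind speed exceeds a gale threshold at one location and time).

def event_probability(ensemble_values, threshold):
    """Fraction of ensemble members in which the event occurs."""
    hits = sum(1 for v in ensemble_values if v > threshold)
    return hits / len(ensemble_values)

# Ten members' predicted wind speed (m/s) -- illustrative values
members = [18, 22, 25, 19, 27, 24, 21, 26, 23, 20]
p_gale = event_probability(members, threshold=24)
```

With 50 or more members, as in GenCast, these empirical probabilities become fine-grained enough to inform real decisions, such as whether to issue a warning.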
Today, in a paper published in Nature, we present GenCast, our new high resolution (0.25°) AI ensemble model. GenCast provides better forecasts of both day-to-day weather and extreme events than the top operational system, the European Centre for Medium-Range Weather Forecasts’ (ECMWF) ENS, up to 15 days in advance. We’ll be releasing our model’s code, weights, and forecasts, to support the wider weather forecasting community.
The Evolution of AI Weather Models
GenCast marks a critical advance in AI-based weather prediction that builds on our previous weather model, which was deterministic and provided a single best estimate of future weather. By contrast, a GenCast forecast comprises an ensemble of 50 or more predictions, each representing a possible weather trajectory.
GenCast is a diffusion model, the type of generative AI model that underpins the recent, rapid advances in image, video and music generation. However, GenCast differs from these, in that it’s adapted to the spherical geometry of the Earth, and learns to accurately generate the complex probability distribution of future weather scenarios when given the most recent state of the weather as input.
To train GenCast, we provided it with four decades of historical weather data from ECMWF’s ERA5 archive. This data includes variables such as temperature, wind speed, and pressure at various altitudes. The model learned global weather patterns, at 0.25° resolution, directly from this processed weather data.
Setting a New Standard for Weather Forecasting
To rigorously evaluate GenCast’s performance, we trained it on historical weather data up to 2018, and tested it on data from 2019. GenCast showed better forecasting skill than ECMWF’s ENS, the top operational ensemble forecasting system that many national and local decisions depend upon every day.
We comprehensively tested both systems, looking at forecasts of different variables at different lead times — 1320 combinations in total. GenCast was more accurate than ENS on 97.2% of these targets, and on 99.8% at lead times greater than 36 hours.
Better forecasts of extreme weather, such as heat waves or strong winds, enable timely and cost-effective preventative actions. GenCast offers greater value than ENS when making decisions about preparations for extreme weather, across a wide range of decision-making scenarios.
An ensemble forecast expresses uncertainty by making multiple predictions that represent different possible scenarios. If most predictions show a cyclone hitting the same area, uncertainty is low. But if they predict different locations, uncertainty is higher. GenCast strikes the right balance, neither overstating nor understating its confidence in its forecasts.
It takes a single Google Cloud TPU v5 just 8 minutes to produce one 15-day forecast in GenCast’s ensemble, and every forecast in the ensemble can be generated simultaneously, in parallel. Traditional physics-based ensemble forecasts such as those produced by ENS, at 0.2° or 0.1° resolution, take hours on a supercomputer with tens of thousands of processors.
Advanced Forecasts for Extreme Weather Events
More accurate forecasts of risks of extreme weather can help officials safeguard more lives, avert damage, and save money. When we tested GenCast’s ability to predict extreme heat and cold, and high wind speeds, GenCast consistently outperformed ENS.
Now consider tropical cyclones, also known as hurricanes and typhoons. Getting better warnings, further in advance, of where they’ll strike land is invaluable. GenCast delivers superior predictions of the tracks of these deadly storms.
GenCast’s ensemble forecast shows a wide range of possible paths for Typhoon Hagibis seven days in advance, but the spread of predicted paths tightens over several days into a high-confidence, accurate cluster as the devastating cyclone approaches the coast of Japan.
Better forecasts could also play a key role in other aspects of society, such as renewable energy planning. For example, improvements in wind-power forecasting directly increase the reliability of wind power as a source of sustainable energy and could accelerate its adoption. In a proof-of-principle experiment that analyzed predictions of the total wind power generated by groupings of wind farms all over the world, GenCast was more accurate than ENS.
Next Generation Forecasting and Climate Understanding at Google
GenCast is part of Google’s growing suite of next-generation AI-based weather models, including Google DeepMind’s AI-based deterministic medium-range forecasts, and Google Research’s NeuralGCM, SEEDS, and floods models. These models are starting to power user experiences on Google Search and Maps, and improving the forecasting of precipitation, wildfires, flooding and extreme heat.
We deeply value our partnerships with weather agencies, and will continue working with them to develop AI-based methods that enhance their forecasting. Meanwhile, traditional models remain essential for this work. For one thing, they supply the training data and initial weather conditions required by models such as GenCast. This cooperation between AI and traditional meteorology highlights the power of a combined approach to improve forecasts and better serve society.
To foster wider collaboration and help accelerate research and development in the weather and climate community, we’ve made GenCast an open model and released its code and weights, as we did for our deterministic medium-range global weather forecasting model.
We’ll soon be releasing real-time and historical forecasts from GenCast, and previous models, which will enable anyone to integrate these weather inputs into their own models and research workflows.
We are eager to engage with the wider weather community, including academic researchers, meteorologists, data scientists, renewable energy companies, and organizations focused on food security and disaster response. Such partnerships offer deep insights and constructive feedback, as well as invaluable opportunities for commercial and non-commercial impact, all of which are critical to our mission to apply our models to benefit humanity.
- Read our paper
Acknowledgements
We would like to recognize Raia Hadsell for supporting this work. We are grateful to Molly Beck for providing legal support; Ben Gaiarin, Roz Onions and Chris Apps for providing licensing support; Matthew Chantry, Peter Dueben and the dedicated team at the ECMWF for their help and feedback; and to our Nature reviewers for their careful and constructive feedback.
This work reflects the contributions of the paper’s co-authors: Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom Andersson, Andrew El-Kadi, Dominic Masters, Timo Ewalds, Jacklynn Stott, Shakir Mohamed, Peter Battaglia, Remi Lam, and Matthew Willson.
A New Benchmark for Evaluating the Factuality of LLMs
Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in provided source material and avoid hallucinations
Large language models (LLMs) are transforming how we access information, yet their grip on factual accuracy remains imperfect. They can “hallucinate” false information, particularly when given complex inputs. In turn, this can erode trust in LLMs and limit their applications in the real world.
Today, we’re introducing FACTS Grounding, a comprehensive benchmark for evaluating the ability of LLMs to generate responses that are not only factually accurate with respect to given inputs, but also sufficiently detailed to provide satisfactory answers to user queries.
We hope our benchmark will spur industry-wide progress on factuality and grounding. To track progress, we’re also launching the FACTS leaderboard on Kaggle. We’ve already tested leading LLMs using FACTS Grounding and have populated the initial leaderboard with their grounding scores. We will maintain and update the leaderboard as the field advances.
Current leaderboard ranking
FACTS Grounding dataset
To enable accurate evaluation of the factuality and grounding of any given LLM, the FACTS Grounding dataset comprises 1,719 examples, each carefully crafted to require a long-form response grounded in the provided context document. Each example comprises a document, a system instruction requiring the LLM to exclusively reference the provided document, and an accompanying user request.
An example from the FACTS Grounding dataset
All examples are divided into a “public” set (860) and a “private” held-out set (859). We are releasing the public set today so anyone can use it to evaluate an LLM. Of course, we know that issues of benchmark contamination and leaderboard hacking are important to protect against, so, following standard industry practice, we are keeping the private evaluation set held out. The FACTS leaderboard scores are the average performance across both public and private sets.
To ensure a diversity of inputs, the FACTS Grounding examples include documents with a variety of lengths, up to a maximum of 32,000 tokens (roughly 20,000 words), covering domains such as finance, technology, retail, medicine, and law. The user requests are similarly wide ranging, including requests for summarization, Q&A generation, and rewriting tasks. We did not include any examples that could require creativity, mathematics, or complex reasoning – capabilities which might require the model to apply more advanced reasoning in addition to grounding.
Prompt distribution
Collective judgement by leading LLMs
To succeed on a given example, an LLM must synthesize the complex information in the document and generate a long-form response that is both a comprehensive answer to the user request and fully attributable to that document.
FACTS Grounding evaluates model responses automatically using three frontier LLM judges — namely Gemini 1.5 Pro, GPT-4o, and Claude 3.5 Sonnet. We selected a combination of different judges to mitigate any potential bias of a judge giving higher scores to the responses produced by a member of its own model family. The automatic judge models were comprehensively evaluated against a held-out test set to find the best performing judging prompt templates and to verify agreement with human raters.
Each FACTS Grounding example is judged in two phases. First, responses are evaluated for eligibility, and disqualified if they don’t sufficiently address the user’s request. Second, responses are judged as factually accurate if they are fully grounded in information contained in the provided document, with no hallucinations.
With the eligibility and grounding accuracy of a given LLM response evaluated separately by multiple AI judge models, the results are then aggregated to determine if the LLM has dealt with the example successfully. The final score for the overall grounding task is the average of all judge models’ scores across all examples. Find more details of our FACTS Grounding evaluation methodology in our paper.
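The two-phase aggregation described above can be sketched as follows. This is a simplified reading of the scheme: the function names, and the rule that a response scores only when it is both eligible and fully grounded, are assumptions made for illustration; the exact formula is in the FACTS paper.

```python
# Hedged sketch of FACTS-style scoring: each judge model marks every
# response for eligibility (phase 1) and grounding (phase 2); per-judge
# scores are averaged across examples, then across judges.

def judge_score(verdicts):
    """verdicts: list of (eligible, grounded) booleans, one per example."""
    passed = [1.0 if eligible and grounded else 0.0
              for eligible, grounded in verdicts]
    return sum(passed) / len(passed)

def final_score(per_judge_verdicts):
    """Average the per-judge scores across all judge models."""
    scores = [judge_score(v) for v in per_judge_verdicts]
    return sum(scores) / len(scores)

# Three judges, four examples each (illustrative verdicts)
judges = [
    [(True, True), (True, False), (False, True), (True, True)],
    [(True, True), (True, True), (False, True), (True, True)],
    [(True, True), (True, False), (True, True), (True, True)],
]
score = final_score(judges)
```

Averaging over multiple judge models from different families is what mitigates the self-preference bias mentioned above, since no single judge's idiosyncrasies dominate the final number.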
A factually correct response that fails to properly address the user’s request fails the benchmarking example. Here we see three instances of model responses that the automated LLM judges considered ineligible
FACTS Grounding will continue to evolve
We are mindful that benchmarks can be quickly overtaken by progress, so this launch of our FACTS Grounding benchmark and leaderboard is just the beginning. Factuality and grounding are among the key factors that will shape the future success and usefulness of LLMs and broader AI systems, and we aim to grow and iterate FACTS Grounding as the field progresses, continually raising the bar.
We encourage the AI community to engage with FACTS Grounding, to evaluate their models on the open set of examples, or to submit their models for evaluation. We believe that comprehensive benchmarking methods, coupled with continuous research and development, will continue to improve AI systems.
Acknowledgements
FACTS is a collaboration between Google DeepMind and Google Research.
FACTS Grounding was led by: Alon Jacovi, Andrew Wang, Chris Alberti, Connie Tao, Dipanjan Das, Jon Lipovetz, Kate Olszewska, Lukas Haas, Michelle Liu, and Nate Keating.
We are also very grateful for contributions from: Adam Bloniarz, Carl Saroufim, Corey Fry, Dror Marcus, Doron Kukliansky, Gaurav Singh Tomar, James Swirhun, Jinwei Xing, Lily Wang, Madhu Gurumurthy, Michael Aaron, Moran Ambar, Rachana Fellinger, Rui Wang, Zizhao Zhang, and Sasha Goldshtein.
We would also like to thank Avinatan Hassidim, D. Sculley, Fernando Pereira, Koray Kavukcuoglu, Slav Petrov, Ya Xu, and Yossi Matias for their continued support.