Links
This article discusses the development of a Software Factory that leverages AI for non-interactive coding. It focuses on using scenarios instead of traditional tests and introduces the Digital Twin Universe for validating software against behavioral clones of services.
This article discusses how AI tools necessitate stricter coding practices to produce high-quality software. It emphasizes the importance of 100% code coverage, thoughtful file organization, and automated best practices to support AI in writing effective code.
This article discusses challenges faced by AI agents when performing long tasks across multiple sessions without memory. It introduces a two-part solution using initializer and coding agents to ensure consistent progress, effective environment setup, and structured updates to maintain project integrity.
This article promotes a 30-minute demo of Momentic's AI-powered testing platform. It aims to explore current QA practices at your organization and discuss how Momentic can help achieve your team's objectives.
The article discusses how AI changes the landscape of code reviews, making the reviewer's job more complex. It outlines specific heuristics for assessing pull requests (PRs), focusing on aspects like design, testing, error handling, and the effort put in by the author. The author emphasizes the need for human oversight despite advances in AI review tools.
Tristan Hume discusses the evolution of a take-home test developed for hiring performance engineers at Anthropic. As AI models like Claude have improved, the test has been repeatedly redesigned to maintain its effectiveness in distinguishing human talent from AI capabilities. The article also shares insights from the original design and the challenges posed by increasingly capable AI systems.
This article discusses the challenges of using AI to generate code for distributed systems, emphasizing that traditional coding practices can lead to bugs that are hard to catch. It argues for frameworks like Hydro that make distributed behavior explicit and aim to reduce these bugs by design, rather than relying solely on testing.
This article explains how AI is changing the code review process, emphasizing the need for evidence of code functionality rather than just relying on AI-generated outputs. It contrasts solo developers’ fast-paced workflows with team dynamics, where human judgment remains essential for quality and security. The piece outlines best practices for integrating AI into development and review processes.
Gotests is a tool that automatically generates table-driven tests for Go functions and methods by analyzing their signatures. It supports filtering, custom templates, and even AI-generated test cases, making it efficient for developers to ensure test coverage.
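For readers unfamiliar with the pattern, a table-driven test pairs a list of input/expected-output cases with a single test body that loops over them. gotests itself emits Go; the sketch below shows the same pattern in Python with pytest's parametrize, using a stand-in `add` function for illustration.

```python
import pytest

# Stand-in function under test; gotests would derive the case table
# from a real function's signature instead.
def add(a: int, b: int) -> int:
    return a + b

# The "table": each tuple is one case of inputs and expected output.
@pytest.mark.parametrize(
    "a, b, expected",
    [
        (1, 2, 3),    # simple positive case
        (-1, 1, 0),   # mixed signs
        (0, 0, 0),    # zero edge case
    ],
)
def test_add(a: int, b: int, expected: int) -> None:
    assert add(a, b) == expected
```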
The article details how an AI coding agent inadvertently introduced an infinite recursion bug in a web application. A crucial comment was deleted during a UI refactor, taking with it the safety constraint it described, and browsers froze and crashed as a result. The author emphasizes the importance of tests over comments in an AI-augmented coding environment.
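The underlying lesson, encoding the invariant as a test so no refactor can silently drop it, might look like the hypothetical sketch below: a recursive render helper with an explicit depth limit and a test that pins the limit down (the function and limit are illustrative, not the article's actual code).

```python
import pytest

MAX_RENDER_DEPTH = 50  # illustrative safety constraint

def render_tree(node: dict, depth: int = 0) -> str:
    """Recursively render nested nodes, refusing to recurse forever."""
    if depth > MAX_RENDER_DEPTH:
        raise RecursionError("render_tree exceeded MAX_RENDER_DEPTH")
    children = "".join(render_tree(c, depth + 1) for c in node.get("children", []))
    return f"<div>{node.get('label', '')}{children}</div>"

def test_render_tree_rejects_runaway_nesting() -> None:
    # Build a chain deeper than the limit; a self-referencing node
    # would model the original bug (a cycle) just as well.
    node: dict = {"label": "leaf"}
    for _ in range(MAX_RENDER_DEPTH + 2):
        node = {"label": "wrap", "children": [node]}
    with pytest.raises(RecursionError):
        render_tree(node)
```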
Terminal-Bench 2.0 launches with a new testing framework, Harbor, aimed at improving the evaluation of AI agents in terminal-based tasks. The update includes 89 validated tasks and addresses previous inconsistencies, while Harbor supports scalable testing in cloud environments.
Claude is being tested as a Chrome extension to enhance browser-based AI capabilities while addressing security risks like prompt injection. The pilot aims to gather feedback on safety and usability before a broader release, with participants having control over what Claude can do and access.
The article details an attempt to recreate the 1996 Space Jam website using Anthropic's Claude. Despite being given the original assets and guidance, Claude struggles with accuracy and measurement, producing increasingly incorrect versions of the layout. The author documents the process and the frustrations of working with AI on this project.
Codacy introduces a hybrid code review engine that enhances Pull Request feedback by identifying logic gaps, security issues, and code complexity. It automates the review process, letting developers ship code faster and with more confidence.
This article outlines how Qodo developed a benchmark to evaluate AI code review systems. It highlights a new methodology that injects defects into real pull requests to assess both bug detection and code quality, demonstrating superior results compared to other platforms.
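Defect injection of this kind can be as simple as mutating one operator in an otherwise-correct change and checking whether the review system flags it. A minimal, hypothetical Python sketch using the standard ast module:

```python
import ast

class FlipFirstEquality(ast.NodeTransformer):
    """Inject a single defect by turning the first == into !=."""
    def __init__(self) -> None:
        self.injected = False

    def visit_Compare(self, node: ast.Compare) -> ast.AST:
        self.generic_visit(node)
        if not self.injected and any(isinstance(op, ast.Eq) for op in node.ops):
            node.ops = [ast.NotEq() if isinstance(op, ast.Eq) else op
                        for op in node.ops]
            self.injected = True
        return node

src = "def is_admin(user):\n    return user.role == 'admin'\n"
tree = FlipFirstEquality().visit(ast.parse(src))
print(ast.unparse(tree))  # return user.role != 'admin'  <- the planted bug
```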
The article explores how techniques like the Ralph Wiggum loop let AI coding agents automate software development from clear specifications and robust tests. It highlights Simon Willison's success in creating an HTML5 parser while multitasking, demonstrating the potential of agents to handle complex tasks autonomously. The key lies in defining success criteria and verifying results efficiently.
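The loop itself is almost trivially simple; a hedged sketch (the `my-coding-agent` CLI and spec file are placeholders, and the real pattern shells out to whatever agent and test runner a project uses):

```python
import subprocess

PROMPT_FILE = "SPEC.md"  # hypothetical: the fixed spec fed to the agent each turn
MAX_TURNS = 20

def tests_pass() -> bool:
    # Success criterion: the whole suite is green.
    return subprocess.run(["pytest", "-q"]).returncode == 0

for turn in range(MAX_TURNS):
    if tests_pass():
        print(f"done after {turn} turns")
        break
    # Re-run the agent with the same spec; it sees the repo's current
    # state (including failing tests) and takes another pass.
    subprocess.run(["my-coding-agent", "--prompt-file", PROMPT_FILE], check=False)
else:
    print("gave up: success criteria not met within MAX_TURNS")
```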
Vijil provides a framework for building reliable, secure, and compliant AI agents. It addresses enterprise concerns about trust through hardened models, continuous testing, and adaptive defenses, helping organizations deploy AI solutions faster and with greater confidence.
This article discusses the benefits and challenges of using AI in programming from the perspective of a senior engineer. It shares practical tips and personal insights on how to effectively integrate AI tools into workflows while addressing common concerns about code quality and understanding.
Momentic is a platform that automates testing for software teams, allowing them to create tests using natural language. It reduces the time needed for test automation, lowers false positives, and improves release cadence. The AI-driven tool adapts to changes in the application, making QA more efficient.
QA Wolf offers an AI-driven testing platform that automates complex tests, from APIs to mobile apps. It provides features like parallel test execution, detailed bug reporting, and seamless integration with CI tools. Users benefit from real-time support and significant time savings in their QA processes.
This article discusses how just-in-time tests (JiTTests), generated by large language models, streamline testing in fast-paced software development. Unlike traditional tests, JiTTests adapt to code changes without ongoing maintenance, focusing on catching serious bugs efficiently.
The author used an AI tool to repeatedly modify a codebase, aiming to enhance its quality through an automated process. While the AI added significant lines of code and tests, many of the changes were unnecessary or unmaintainable, leaving the core functionality largely intact but cluttered. The exercise highlighted the pitfalls of prioritizing quantity over genuine quality improvements.
SmartBear AI offers tools for automating software testing and quality assurance. Users can apply for a private beta to experience AI agents that run tests and generate audit reports. The SmartBear MCP server helps teams streamline workflows and improve testing processes without extensive coding knowledge.
The article outlines a process for migrating a large codebase of frontend tests from React Testing Library v13 to v14 using AI tools. It details the challenges faced during the migration, including code changes and maintaining test coverage, while emphasizing the iterative approach taken to improve both the migration guide and the codemod.
Meticulous automates testing by monitoring user interactions and generating a comprehensive test suite. It simplifies the testing process by recording sessions and providing side-effect free tests, allowing developers to see the impact of code changes before merging.
Decipher AI automates end-to-end testing by generating tests from recorded actions or described workflows. It keeps tests updated automatically and alerts teams when users encounter bugs in production, ensuring quick resolutions and minimal disruption.
Gemini 3.0 has been spotted in A/B testing on Google AI Studio, showcasing its advanced coding performance through SVG image generation. The author tested the model by creating an SVG image of an Xbox 360 controller, noting impressive results compared to the previous Gemini 2.5 Pro model, despite longer processing times.
The article discusses the future of testing in DevOps, highlighting the trends and technologies expected to shape the landscape by 2025. It emphasizes the importance of automation, continuous testing, and collaboration among teams to enhance software quality and delivery speed. Key insights include the integration of AI and machine learning into testing processes to improve efficiency and effectiveness.
A Model Context Protocol (MCP) server has been developed to comply with the MCP 2025-03-26 specification, featuring tools, resources, prompts, and enhanced sampling capabilities. It integrates HackerNews and GitHub APIs for AI-powered analysis and demonstrates robust test coverage, although some concurrency limitations exist in certain functionalities. The server is production-ready with a rich CLI for testing and interaction.
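For reference, the official MCP Python SDK keeps the basic shape of such a server small. A minimal sketch (the tool body and server name are illustrative, not this project's code), assuming `pip install mcp`:

```python
import json
import urllib.request

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("hn-demo")  # illustrative server name

@mcp.tool()
def top_hn_stories(limit: int = 5) -> list[int]:
    """Return the IDs of the current top HackerNews stories."""
    url = "https://hacker-news.firebaseio.com/v0/topstories.json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)[:limit]

if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio by default
```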
An AI-powered tool, sqlmap-ai, enhances SQL injection testing by automating processes such as result analysis and providing step-by-step suggestions tailored to specific database management systems. It supports various AI providers and features adaptive testing, making it user-friendly for both experts and newcomers in cybersecurity.
The article discusses the integration of AI in enhancing application quality through automated test generation. It highlights the benefits of using AI tools to improve testing efficiency and accuracy, ultimately leading to better software performance and user satisfaction. The focus is on how AI can streamline the testing process and reduce the time developers spend on manual testing tasks.
A recent experience with a broken demo booking form led to the implementation of an AI browser agent to automatically test the site's functionality. This agent performs tasks like filling out forms and checking for available time slots, ensuring that the user experience is smooth and effective. The setup is quick and provides real-time alerts for any issues, enhancing the overall quality assurance process.
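Even without an AI agent, the core check is scriptable; a plain Playwright sketch of such a booking-form probe, with the URL and selectors assumed for illustration:

```python
from playwright.sync_api import sync_playwright

def check_booking_form() -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com/book-demo")  # hypothetical URL
        page.fill("#name", "Test User")             # hypothetical selectors
        page.fill("#email", "test@example.com")
        page.click("button[type=submit]")
        # Fail loudly if no time slots render after submitting.
        page.wait_for_selector(".time-slot", timeout=10_000)
        browser.close()

if __name__ == "__main__":
    check_booking_form()  # wire into cron/CI to get alerts on breakage
```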
Google is testing a new AI mode that alters search results to encourage more clicks. This change aims to enhance user engagement and improve the overall search experience, potentially impacting how information is presented to users. The adjustments are part of Google's ongoing efforts to integrate AI into its services.
QA Wolf AI version 4.5 introduces a multi-agent system that generates Playwright tests significantly faster, reducing the time from 29 minutes to just 6 minutes. With specialized agents for outlining, coding, and verifying tests, the system achieves high accuracy and efficiency, enabling engineers to accomplish five times more work in the same period. The transparency of the agents' decision-making process ensures accountability for QA engineers and clients alike.
The article discusses the author's experience with AI-based coding, emphasizing a collaborative approach between human engineers and AI agents to enhance code quality and productivity. Despite achieving significant coding throughput, the author warns that the increased speed of commits can lead to more frequent bugs, advocating for improved testing methods to mitigate these risks.