langgraph-cli==0.4.18 released (2026-03-15) — small CLI patch and message update
A new langgraph-cli release (0.4.18) was published on 2026-03-15. The release notes indicate a small CLI patch (including an updated error message) and general release metadata. No breaking changes or deprecations are indicated in the diffed release entry.
Community reports: unexpected tier downgrades and failed tier upgrades causing quota and rate-limit disruptions (March 15, 2026)
Multiple community forum threads (Google AI Developers / Build with Google AI forum) dated March 15, 2026 report that paid Gemini API projects were automatically downgraded from higher tiers to Tier 1, causing sudden lower quotas and 429 errors. Separate threads report inability to upgrade from Tier 1 to Tier 2 despite meeting spend and time requirements. These posts indicate a possible platform incident affecting quotas and tier controls.
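For callers hit by the sudden 429s described above, a standard client-side mitigation is exponential backoff with jitter. This is a generic sketch (stdlib only, with a stand-in exception type); it does not restore lost quota, it only smooths retries until limits recover or the incident is resolved.

```python
import random
import time

class QuotaExceededError(Exception):
    """Stand-in for an HTTP 429 quota/rate-limit error from the API."""

def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Retry fn() on quota errors with exponential backoff plus jitter.

    Delay doubles each attempt (capped at max_delay); a small random
    jitter avoids synchronized retry storms across many clients.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the 429 to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

In practice the wrapped `fn` would be the actual Gemini API call, with the real 429 exception type substituted for `QuotaExceededError`.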
Gemini 3.1 Flash-Lite public preview page published/updated (Last updated 2026-03-14)
A new Vertex AI docs page for "Gemini 3.1 Flash-Lite" (public preview) is present and lists model specs, supported inputs/outputs, token limits, supported consumption options, regions (global), and states the page was last updated 2026-03-14. The page indicates the model is in public preview and references Pre-GA terms; release date on the page is March 3, 2026.
User reports: Azure OpenAI Realtime API server_error, batch jobs stuck, and access issues to GPT-5.x (Microsoft Q&A reports)
Multiple Microsoft Q&A threads (mid-March 2026) report Realtime API server_error responses, batch jobs stuck in validation, and inability for some customers to access certain GPT-5.x endpoints or receive approved quota increases. These are user-reported incidents and not (yet) posted as an official Azure status incident on the public status page.
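For batch jobs that appear stuck in validation, one defensive pattern is to poll with a hard deadline instead of waiting indefinitely. The sketch below is a generic stdlib helper, not an Azure SDK call; the status strings are placeholders for whatever the real batch API returns.

```python
import time

def wait_for_terminal_status(get_status, terminal=("completed", "failed"),
                             timeout_s=3600, poll_interval_s=5.0):
    """Poll get_status() until it returns a terminal state, or give up.

    Raises TimeoutError once timeout_s elapses so the caller can
    escalate, cancel, or resubmit rather than polling a stuck job forever.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {status!r} after {timeout_s}s")
        time.sleep(poll_interval_s)
```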
Unconfirmed reports: Opus 4.6 context window behavior changes
Community posts report that Opus 4.6 behavior around context window defaults (e.g., 1M context window) may have changed; these reports currently come from user/community threads and have not been confirmed by official Anthropic release notes or docs. The change, if true, would affect token handling and pricing/throughput assumptions.
openai-agents-python: March 2026 release series (v0.11.0 through v0.12.2)
The openai-agents-python repository released a series of versions in March 2026 (notably v0.11.0 on 2026-03-09, v0.12.0 on 2026-03-12, v0.12.1 on 2026-03-13, and v0.12.2 on 2026-03-14). Key changes include Responses API tool-search support and GA computer-tool migration, an opt-in model retry policy configuration, WebSocket mode support for the Responses API, returning McpError as a structured error result (avoiding crashes), and multiple docs/compatibility fixes (Python packaging/imports, tracing, and realtime transport guidance).
langchain-anthropic==1.3.5 released (2026-03-14) — fixes for streaming and partner bumps
LangChain published langchain-anthropic==1.3.5 on 2026-03-14 which contains multiple fixes (including support for eager_input_streaming and other Anthropic-related fixes) and dependency bumps (notably bumping langgraph and langgraph-checkpoint in the Anthropic partner libs). These changes are integration-level and affect how LangChain agents interact with Anthropic streaming and partner tooling.
Elevated errors reported on Claude Opus 4.6 and Sonnet 4.6 (service incident)
Multiple status and community posts reported elevated error rates and partial service disruption affecting Claude Opus 4.6 and Sonnet 4.6 around 2026-03-14. Reports originated from status/ticket updates and community threads indicating increased errors for API calls and consumer-facing apps; some automated posts referenced a resolved/mitigated state after initial reports.
Gemini API pricing page updated (Last updated 2026-03-13); Batch API referenced in nav
The Gemini API pricing documentation was updated on 2026-03-13: the page's "Last updated" timestamp was changed to 2026-03-13 UTC and the documentation/navigation now references the Batch API (new section/menu item). There were no numeric pricing values restored on the page — previously-observed 'Not available' entries remain.
AWS and Cerebras to offer disaggregated Trainium + Cerebras inference via Amazon Bedrock
AWS and Cerebras announced a collaboration to deploy Cerebras CS-3 systems in AWS data centers and offer a Trainium + Cerebras disaggregated inference solution accessible via Amazon Bedrock. The joint solution uses Trainium for prefill and Cerebras WSE/CS-3 for decode, promising much higher token-per-second decode throughput and up to ~5x more high-speed token capacity. AWS said the capability will be available via Bedrock in the coming months and that major open-source LLMs and Amazon Nova models will be offered on Cerebras hardware.
ChatGPT release notes: write actions added for Google and Microsoft apps (admin scopes updated)
On March 13, 2026 the ChatGPT Enterprise/Edu release notes were updated to announce support for write actions in Google and Microsoft apps. Write actions are disabled by default and must be enabled by workspace admins in Settings > Apps > Manage actions. Microsoft customers may also need Microsoft Entra admin approval for updated scopes before users can connect.
1M token context window generally available for Claude Opus 4.6 and Sonnet 4.6
Multiple community posts, release-note aggregators, and recent client/tooling release notes indicate that Claude Opus 4.6 and Sonnet 4.6 now support a 1,000,000-token (1M) context window as generally available (and, in some integrations, as the default); some client releases also show Opus 4.6 becoming the default model in integrations. Reports further state that the 1M context is offered at standard pricing and that the dedicated 1M rate limits were removed.
Reports that Azure startup credits may not apply to Claude billing
Press reporting surfaced cases where Azure startup credits did not apply to Claude usage; Anthropic stated it has no visibility into Azure billing and directed customers to contact Microsoft. The reporting highlights potential gaps between cloud provider credit programs and billed usage for third‑party AI services.
1M token context becomes generally available for Claude Opus 4.6 and Sonnet 4.6 at standard pricing
Anthropic announced that the full 1,000,000-token context window for Claude Opus 4.6 and Claude Sonnet 4.6 is now generally available and is being offered at standard API pricing. The rollout makes the 1M context variant the default for some Claude Code and higher-tier plan usage while Pro/Sonnet users may opt in or see different availability. Third‑party reports cite per‑million token rates for the 1M variants (reported: Opus 4.6 ~ $5 input / $25 output per million tokens; Sonnet 4.6 ~ $3 input / $15 output per million tokens).
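Using the third-party-reported per-million-token rates above (which are unconfirmed and should be treated as illustrative, not official pricing), a request cost estimate is simple arithmetic: tokens divided by one million, times the rate, summed over input and output.

```python
# Reported (third-party, unconfirmed) per-million-token rates for the
# 1M-context variants; illustrative inputs, not an official price table.
REPORTED_RATES = {
    "claude-opus-4.6":   {"input": 5.00,  "output": 25.00},
    "claude-sonnet-4.6": {"input": 3.00,  "output": 15.00},
}

def estimate_cost_usd(model, input_tokens, output_tokens, rates=REPORTED_RATES):
    """Estimate request cost: tokens / 1e6 * per-million rate, summed."""
    r = rates[model]
    return input_tokens / 1e6 * r["input"] + output_tokens / 1e6 * r["output"]
```

At the reported rates, a full 1M-token input to Sonnet 4.6 with no output would come to about $3.00, and 1M input plus 100k output on Opus 4.6 to about $7.50.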
Limited-time March 13–27, 2026: doubled off-peak usage for Free/Pro/Max/Team plans
Anthropic published a limited‑time promotion (Claude March 2026 usage promotion) that doubles usage limits during off‑peak hours for eligible plans. The promotion runs from March 13, 2026 through March 27, 2026, and applies to Free, Pro, Max, and Team plans (Enterprise is excluded).
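The promotion's mechanics reduce to a date-window and hour check. The sketch below encodes the announced window and eligible plans; the specific off-peak hours are not pinned down in this summary, so `off_peak_hours` is a hypothetical placeholder (UTC 00:00-06:59) to be replaced with the hours from the announcement.

```python
from datetime import datetime

PROMO_START = datetime(2026, 3, 13)
PROMO_END = datetime(2026, 3, 27, 23, 59, 59)
ELIGIBLE_PLANS = {"free", "pro", "max", "team"}  # Enterprise is excluded

def effective_limit(base_limit, plan, when, off_peak_hours=range(0, 7)):
    """Double base_limit during off-peak hours within the promo window.

    off_peak_hours is an assumed placeholder; the announcement defines
    the real off-peak hours.
    """
    in_window = PROMO_START <= when <= PROMO_END
    if plan.lower() in ELIGIBLE_PLANS and in_window and when.hour in off_peak_hours:
        return base_limit * 2
    return base_limit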
New CloudWatch metrics (TimeToFirstToken, EstimatedTPMQuotaUsage) for Amazon Bedrock (Mar 13, 2026)
AWS announced two new CloudWatch metrics for Amazon Bedrock — TimeToFirstToken and EstimatedTPMQuotaUsage — which are automatically emitted to the AWS/Bedrock namespace for successful inference requests. TimeToFirstToken gives server-side streaming latency (ms) and EstimatedTPMQuotaUsage estimates Tokens-Per-Minute quota consumption (accounting for burndown multipliers and cache weighting). Both metrics are available now and designed to help set alarms, baselines, and capacity planning.
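An alarm on the new TimeToFirstToken metric can be wired up with standard CloudWatch tooling. The sketch below only builds the parameter dict; the metric name and `AWS/Bedrock` namespace come from the announcement, while the `ModelId` dimension, threshold, and evaluation periods are assumptions to adapt. The dict would be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`.

```python
def ttft_alarm_params(model_id, threshold_ms=1500.0):
    """Build put_metric_alarm parameters for the TimeToFirstToken metric.

    Alarms when average server-side streaming latency stays above
    threshold_ms for five consecutive one-minute periods (assumed values).
    """
    return {
        "AlarmName": f"bedrock-ttft-high-{model_id}",
        "Namespace": "AWS/Bedrock",
        "MetricName": "TimeToFirstToken",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Average",
        "Period": 60,               # seconds per evaluation period
        "EvaluationPeriods": 5,
        "Threshold": threshold_ms,  # the metric is reported in milliseconds
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }
```

A similar alarm on EstimatedTPMQuotaUsage (e.g. at 80% of the account's TPM quota) would give early warning before throttling starts.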
Elevated errors on Claude Opus 4.6 and Sonnet 4.6 (status incident)
Anthropic posted a status incident (incident q58b2gkv64pw) on March 13 reporting elevated errors affecting Claude Opus 4.6 and Sonnet 4.6. The incident was visible in status and was referenced by automated social posts.
Temporary 2× off-peak usage boost for Claude (Mar 13–27, 2026)
Anthropic announced a limited-time boost that temporarily doubles Claude usage limits during off-peak hours from March 13–27, 2026. The change is automatic for eligible users and applies to web and API usage windows specified in the announcement.
Secure AI agents with Policy in Amazon Bedrock AgentCore (Mar 13, 2026)
AWS published a detailed how-to for using Policy in Amazon Bedrock AgentCore (deep-dive), showing Cedar policy examples, authoring options (natural language generation, form-based, direct Cedar), policy engine lifecycle, gateway association modes (LOG_ONLY and enforce), and sample IAM permissions and cleanup steps. This expands the practical guidance after the earlier GA announcement.
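The how-to's Cedar examples follow Cedar's standard permit/forbid form. A minimal sketch in that shape is below; the entity types, action name, and tool identifier are illustrative placeholders, not AgentCore's actual schema.

```cedar
// Hypothetical sketch: allow any principal to invoke one specific
// gateway tool. Entity and action names here are illustrative only.
permit(
  principal,
  action == AgentCore::Action::"invokeTool",
  resource == AgentCore::Tool::"get_order_status"
);
```

Associating such a policy with a gateway in LOG_ONLY mode first (as the post describes) lets teams observe would-be denials before switching to enforcement.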
Q&A: Clarification that ChatGPT retirements do not automatically remove GPT-4.1 from Azure OpenAI
A Microsoft Q&A moderator posted a clarification (2026-03-13) that the ChatGPT product retirement notices do not automatically imply the same retirement timeline for Azure OpenAI / Azure AI Foundry. Azure manages model lifecycles separately and customers will be notified in the official Azure model retirements documentation if Azure plans to retire an API-accessible model.
openai-agents-js: multiple releases in March 2026 (v0.6.0 through v0.7.1)
The openai-agents-js repository published multiple releases in March 2026 (including v0.6.0 on 2026-03-09, v0.7.0 on 2026-03-12, and v0.7.1 on 2026-03-13). Notable items include Responses tool-search support (with namespace guidance), new opt-in model retry settings (ModelSettings), import path changes for AI SDK adapter users, documentation updates (TypeScript docs and examples), and added helpers for raw model stream events and retry configuration examples.
Weaviate patch/minor releases (Mar 9–12, 2026) add Gemini Embedding 2 support and backup improvements; no breaking changes
Weaviate published a set of minor/patch releases between 2026-03-09 and 2026-03-12 (notably v1.36.3, v1.36.4, v1.36.5). The updates include file-based incremental backup improvements, backup chunking/split file handling, BM25 bug fixes, and explicit support for Google AI Studio / Gemini Embedding 2 in the multi2vec-google module (Gemini Embedding 2 multimodal support). All release notes list "none" under Breaking Changes. No licensing or pricing announcements were found in the same timeframe.
Claude Opus 4.6 default max output raised to 64k tokens; 128k upper bound for Opus 4.6 and Sonnet 4.6
According to release notes surfaced in the Claude Code release stream and press coverage, Anthropic increased the default maximum output token limit for Claude Opus 4.6 to 64k tokens and raised the upper bound for Opus 4.6 and Sonnet 4.6 to 128k tokens to support much longer outputs.
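Client code that sets `max_tokens` may want to encode these caps so requests never exceed the model's upper bound. A minimal sketch, using the limits as reported above (not an official spec table):

```python
# Output-token caps as reported: default raised to 64k for Opus 4.6;
# 128k upper bound for both Opus 4.6 and Sonnet 4.6.
DEFAULT_MAX_OUTPUT = {"claude-opus-4.6": 64_000}
HARD_MAX_OUTPUT = {"claude-opus-4.6": 128_000, "claude-sonnet-4.6": 128_000}

def resolve_max_tokens(model, requested=None):
    """Pick an effective max_tokens: the default when unset, clamped to the cap."""
    cap = HARD_MAX_OUTPUT.get(model)
    value = requested if requested is not None else DEFAULT_MAX_OUTPUT.get(model, cap)
    return min(value, cap) if cap is not None else value
```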
Anthropic invests $100 million into the Claude Partner Network
Anthropic announced the Claude Partner Network, committing an initial $100 million in 2026 to support partner organizations that help enterprises adopt Claude. The program provides training, technical certification (Claude Certified Architect, Foundations available immediately), partner-facing engineering support, co-investment for go-to-market activities, and a Partner Portal; membership is free and applications open immediately.
langgraph==1.1.2 (2026-03-12) — bugfixes and remote-graph context added
LangGraph published langgraph==1.1.2 on 2026-03-12 which includes a fix for stream part generic ordering, adds context support for the remote graph API, and bumps a dependency (tornado). The changes are primarily bug fixes and a feature addition to improve remote graph context handling.
GPT-5.4 suddenly started generating random tokens (developer community reports)
Multiple posts on the OpenAI Developer Community (Mar 12, 2026) report that gpt-5.4 began producing scrambled or random tokens in outputs where it previously worked correctly. This is an unofficial community report (not an official OpenAI announcement) but indicates potential model degradation or an API-related issue affecting developers.
Some users seeing "No Accessible Workspaces" when attempting to login via SSO (resolved)
OpenAI’s status page shows an incident on 2026-03-12 where some users saw “No Accessible Workspaces” when attempting to log in via SSO. The team identified the issue, applied a mitigation, moved to monitoring, and then marked the incident resolved later the same day.
Gemini API pricing page updated — input price cells changed to 'Not available' (Last updated 2026-03-12)
The Gemini API pricing documentation was updated on 2026-03-12: several input pricing cells that previously showed numeric amounts (e.g., image/audio/video input prices) were replaced with the text 'Not available' and the page 'Last updated' timestamp was changed to 2026-03-12. There are also minor wording/formatting edits elsewhere on the page.
Azure Foundry model retirements page: published schedule and dates (last updated 2026-03-12)
Microsoft's Azure OpenAI / Foundry model retirements page (last updated 2026-03-12) lists lifecycle statuses, deprecation and retirement dates for many models (examples: gpt-5-chat versions retiring 2026-04-15; certain gpt-4o standard deployments retiring 2026-03-31 with auto-upgrades beginning 2026-03-09; dall-e-3 retirement noted as 2026-03-04). The page defines notification policies (60 days for GA retirements, 30 days for preview upgrades) and provides guidance for preparing upgrades and monitoring via Azure Service Health.
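Teams tracking these dates can flag upcoming retirements programmatically. The sketch below uses the example dates from the page summary above and a 60-day horizon mirroring the page's stated GA-retirement notice period; it is a generic date check, not an Azure API call.

```python
from datetime import date

# Retirement dates as listed in the summary above (examples only).
RETIREMENTS = {
    "gpt-5-chat": date(2026, 4, 15),
    "gpt-4o (standard)": date(2026, 3, 31),
    "dall-e-3": date(2026, 3, 4),
}

def retiring_soon(today, horizon_days=60, retirements=RETIREMENTS):
    """Return models whose retirement date falls within horizon_days of today.

    Past-dated retirements are excluded; those deployments are already gone
    or auto-upgraded and need a migration, not a warning.
    """
    return sorted(
        model for model, when in retirements.items()
        if 0 <= (when - today).days <= horizon_days
    )
```

Running this against a feed of the full retirements page (rather than the hard-coded examples) would drive alerts alongside Azure Service Health notifications.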