AI Native Medhavi

AI Native Developer News

AI development tools, research, and industry news — clustered and ranked by importance.

Running local models on Macs gets faster with Ollama's MLX support

Ollama has added support for MLX, Apple's open-source machine-learning framework, improving its runtime for running large language models locally. The update improves caching performance and adds support for Nvidia's NVFP4 compression format, which reduces model memory usage. These changes are expected to significantly boost performance on Macs with Apple Silicon chips (M1 or later). The surge of interest in local models, exemplified by OpenClaw's rapid rise past 300,000 stars on GitHub, underscores the growing trend of running machine-learning workloads on local hardware.

Ars Technica - AI · 7h ago
ai-frameworks · ai-models
llm-all-models-async 0.1

The post announces the release of llm-all-models-async 0.1, a new version aimed at improving the performance and ease of use of language models in asynchronous environments. It introduces an improved API for smoother operation when working with multiple models at once. No specific performance figures are given; the focus is on helping developers make better use of language models in their applications and on streamlining AI-assisted development workflows.

Simon Willison · 9h ago
ai-models · ai-frameworks
GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification

The article introduces GISTBench, a benchmark for evaluating how well Large Language Models (LLMs) understand user interests in recommendation systems from interaction histories. It features two new metric families: Interest Groundedness (IG), which includes precision and recall components that penalize hallucination while rewarding coverage, and Interest Specificity (IS), which assesses the distinctiveness of LLM-generated user profiles. The authors release a synthetic dataset based on real user interactions, containing implicit and explicit engagement signals and validated against user surveys. An evaluation of eight open-weight LLMs, ranging from 7B to 120B parameters, uncovers significant performance bottlenecks, particularly in counting and attributing engagement signals.

arXiv CS.AI · 2h ago
ai-research · ai-models
SciVisAgentBench: A Benchmark for Evaluating Scientific Data Analysis and Visualization Agents

The article presents SciVisAgentBench, a new benchmark for evaluating scientific data analysis and visualization agents, developed in response to the need for a principled evaluation framework in this rapidly evolving field. This benchmark includes 108 expert-crafted cases covering multiple scenarios and is structured around four dimensions: application domain, data type, complexity level, and visualization operation. It introduces a multimodal evaluation pipeline that combines human judgment with various deterministic evaluation methods. A validity study involving 12 SciVis experts was conducted to explore the agreement between human and LLM judges, establishing initial baselines and identifying capability gaps in current SciVis agents.

arXiv CS.AI · 2h ago
ai-research · ai-models

Latest

  • Visual Studio Code 1.114
    VS Code Blog · 659m ago
  • Improve coding agents’ performance with Gemini API Docs MCP and Agent Skills.
    Google Developers Blog · 413m ago
  • Wherefore Art Thou? Provenance-Guided Automatic Online Debugging with Lumos
    arXiv CS.SE · 2h ago
  • Webscraper: Leverage Multimodal Large Language Models for Index-Content Web Scraping
    arXiv CS.AI · 2h ago
  • GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification
    arXiv CS.AI · 2h ago
  • SciVisAgentBench: A Benchmark for Evaluating Scientific Data Analysis and Visualization Agents
    arXiv CS.AI · 2h ago
  • SyriSign: A Parallel Corpus for Arabic Text to Syrian Arabic Sign Language Translation
    arXiv CS.CL · 2h ago
  • Compiling Code LLMs into Lightweight Executables
    arXiv CS.SE · 2h ago
  • HackRep: A Large-Scale Dataset of GitHub Hackathon Projects
    arXiv CS.SE · 2h ago
  • Dual Perspectives in Emotion Attribution: A Generator-Interpreter Framework for Cross-Cultural Analysis of Emotion in LLMs
    arXiv CS.CL · 2h ago
  • From Consensus to Split Decisions: ABC-Stratified Sentiment in Holocaust Oral Histories
    arXiv CS.CL · 2h ago
  • Practical Feasibility of Sustainable Software Engineering Tools and Techniques
    arXiv CS.SE · 2h ago
  • ChartDiff: A Large-Scale Benchmark for Comprehending Pairs of Charts
    arXiv CS.AI · 2h ago
  • Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs
    arXiv CS.CL · 2h ago
  • Concept Training for Human-Aligned Language Models
    arXiv CS.CL · 2h ago
  • BayesInsights: Modelling Software Delivery and Developer Experience with Bayesian Networks at Bloomberg
    arXiv CS.SE · 2h ago
  • SkillReducer: Optimizing LLM Agent Skills for Token Efficiency
    arXiv CS.SE · 2h ago
  • Machine Learning in the Wild: Early Evidence of Non-Compliant ML-Automation in Open-Source Software
    arXiv CS.SE · 2h ago
  • EcoScratch: Cost-Effective Multimodal Repair for Scratch Using Execution Feedback
    arXiv CS.SE · 2h ago
  • How and Why Agents Can Identify Bug-Introducing Commits
    arXiv CS.SE · 2h ago
  • Self-Improving Code Generation via Semantic Entropy and Behavioral Consensus
    arXiv CS.SE · 2h ago
  • Sustainable AI Assistance Through Digital Sobriety
    arXiv CS.SE · 2h ago
  • Software Vulnerability Detection Using a Lightweight Graph Neural Network
    arXiv CS.SE · 2h ago
  • Designing FSMs Specifications from Requirements with GPT 4.0
    arXiv CS.SE · 2h ago
  • Logging Like Humans for LLMs: Rethinking Logging via Execution and Runtime Feedback
    arXiv CS.SE · 2h ago
  • Kwame 2.0: Human-in-the-Loop Generative AI Teaching Assistant for Large Scale Online Coding Education in Africa
    arXiv CS.CL · 2h ago
  • CADEL: A Corpus of Administrative Web Documents for Japanese Entity Linking
    arXiv CS.CL · 2h ago
  • SiPaKosa: A Comprehensive Corpus of Canonical and Classical Buddhist Texts in Sinhala and Pali
    arXiv CS.CL · 2h ago
  • MemRerank: Preference Memory for Personalized Product Reranking
    arXiv CS.CL · 2h ago
  • The Thiomi Dataset: A Large-Scale Multimodal Corpus for Low-Resource African Languages
    arXiv CS.CL · 2h ago