Monday, 31 March 2025
New top story on Hacker News: Show HN: GuMCP – Open-source MCP servers, hosted for free
Show HN: GuMCP – Open-source MCP servers, hosted for free
19 by murb | 3 comments on Hacker News.
Hello! We open-sourced all our current MCP servers for platforms like Slack, Google Sheets, Linear, and Perplexity, and we'll be contributing a few more integrations to the project every day. Problems we're hoping to solve:
- Many people are creating MCP servers for the same apps. They're scattered across different repos but are flavors of the same thing. We're making one standardized mono project for all MCP servers.
- Startups are charging to host MCP servers, which blocks tons of people from playing around with MCP casually. We're hosting them for free.
- Non-technical people should be able to use MCP without having to learn how to clone a repo and set up a venv. We're enabling one-click integration for anyone who wants to use the free hosted service.
The plan is to keep contributing until we have an MCP server for basically every useful app anyone could want.
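To give a sense of what using a hosted server looks like from the client side, here is a minimal sketch using the official MCP TypeScript SDK (@modelcontextprotocol/sdk) to connect over SSE and list a server's tools. The endpoint URL is a made-up placeholder, and SSE as the hosted transport is an assumption, not something the post confirms:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

async function main() {
  // Hypothetical hosted endpoint; substitute the real URL from the project docs.
  const transport = new SSEClientTransport(
    new URL("https://example-gumcp-host.invalid/slack/sse")
  );

  const client = new Client(
    { name: "example-client", version: "0.1.0" },
    { capabilities: {} }
  );

  await client.connect(transport);

  // Discover what the hosted server exposes.
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name));

  await client.close();
}

main().catch(console.error);
```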
Sunday, 30 March 2025
Saturday, 29 March 2025
Friday, 28 March 2025
Thursday, 27 March 2025
Wednesday, 26 March 2025
New top story on Hacker News: Google will develop Android OS behind closed doors starting next week
Google will develop Android OS behind closed doors starting next week
33 by josephcsible | 16 comments on Hacker News.
New top story on Hacker News: Botswana Successfully Launches First Satellite, Botsat-1
Botswana Successfully Launches First Satellite, Botsat-1
14 by vinnyglennon | 1 comment on Hacker News.
Tuesday, 25 March 2025
New top story on Hacker News: Show HN: Feudle – a daily puzzle game built with AI
Show HN: Feudle – a daily puzzle game built with AI
13 by papaolivia92 | 5 comments on Hacker News.
I’m a game show nerd who wanted to build a game, despite not being a developer. Using ChatGPT, I created Feudle – a blend of Family Feud and Wordle. Each day, there’s a survey question, and players try to guess the most popular responses—but the twist is that today’s answers come from real players who played yesterday. After playing, you can vote on or submit new questions, shaping future Feudles. If I were to start again, I’d explore newer AI-assisted coding tools like Cursor, but overall, this was a great learning experience. Hopefully, this inspires others to experiment with AI tools to bring their ideas to life. Any feedback? Any tips for AI tools I should explore when iterating on the game?
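The post doesn't share Feudle's code, so purely as illustration, here's a hypothetical sketch of the core mechanic described above: yesterday's free-text answers are tallied into today's ranked board, and a guess scores that answer's popularity count. All names and the normalization rule are assumptions:

```typescript
// Hypothetical sketch: yesterday's free-text answers become today's board.
type BoardEntry = { answer: string; count: number };

function buildBoard(yesterdaysAnswers: string[], topN = 8): BoardEntry[] {
  const tally = new Map<string, number>();
  for (const raw of yesterdaysAnswers) {
    const key = raw.trim().toLowerCase(); // naive normalization
    tally.set(key, (tally.get(key) ?? 0) + 1);
  }
  return [...tally.entries()]
    .map(([answer, count]) => ({ answer, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, topN);
}

// A guess scores the popularity count of the matching board entry, or 0.
function scoreGuess(board: BoardEntry[], guess: string): number {
  const hit = board.find((e) => e.answer === guess.trim().toLowerCase());
  return hit ? hit.count : 0;
}

const board = buildBoard(["Pizza", "pizza ", "Tacos", "Sushi", "pizza"]);
console.log(board);                      // [{ answer: "pizza", count: 3 }, ...]
console.log(scoreGuess(board, "PIZZA")); // 3
```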
Monday, 24 March 2025
Sunday, 23 March 2025
New top story on Hacker News: Germany is unlocking billions to supercharge its military at a seismic moment
Germany is unlocking billions to supercharge its military at a seismic moment
51 by rustoo | 61 comments on Hacker News.
Saturday, 22 March 2025
Friday, 21 March 2025
Thursday, 20 March 2025
New top story on Hacker News: Show HN: AgentKit – JavaScript Alternative to OpenAI Agents SDK with Native MCP
Show HN: AgentKit – JavaScript Alternative to OpenAI Agents SDK with Native MCP
35 by tonyhb | 10 comments on Hacker News.
Hi HN! I’m Tony, co-founder of Inngest. I wanted to share AgentKit, our TypeScript multi-agent library, which we’ve been building and testing with early users in production for months. Although OpenAI’s Agents SDK has launched since, we think an agent framework should offer more deterministic and flexible routing, work with multiple model providers, embrace MCP (for rich tooling), and support the unstoppable and growing community of TypeScript AI developers by enabling a smooth transition to production use cases. This is why we are building AgentKit, and we’re really excited about it for a few reasons.

Firstly, it’s simple. We embrace the KISS principles championed by Anthropic and Hugging Face by letting you gradually add autonomy to your AgentKit program using four primitives:
- Agents: LLM calls that can be combined with prompts, tools, and native MCP support.
- Networks: a simple way to get Agents to collaborate with a shared State, including handoff.
- State: combines conversation history with a fully typed state machine, used in routing.
- Routers: where the autonomy lives, from code-based to LLM-based (e.g., ReAct) orchestration.

The routers are where the magic happens, and they allow you to build deterministic, reliable, testable agents. AgentKit routing works as follows: the network calls itself in a loop, using a router to inspect the State and determine which agent to call next. The returned agent runs, then optionally updates state data using its tools. On the next loop, the network inspects state data and conversation history and determines which new agent to run. This fully typed state-machine routing lets you build agents deterministically using any of the effective agent patterns, which means your code is easy to read, edit, understand, and debug. It also makes handoff incredibly easy: you define when agents should hand off to each other using regular code and state (or by calling an LLM in the router for AI-based routing). This is similar to the OpenAI Agents SDK but easier to manage, plan, and build. (There's a sketch of this routing loop after the links below.)

Then come the local development and moving-to-production capabilities. AgentKit is compatible with Inngest’s tooling, meaning you can test agents using Inngest’s local DevServer, which provides traces, inputs, outputs, replay, tool and MCP inputs and outputs, and (soon) a step-over debugger so you can easily understand and visually see what's happening in the agent loop. In production, you can also optionally combine AgentKit with Inngest for fault-tolerant execution: each agent’s LLM call is wrapped in a step, and tools can use multiple steps to incorporate things like human-in-the-loop. This gives you native orchestration, observability, and out-of-the-box scale.

In the documentation you’ll find an AgentKit SWE-bench example and multiple Coding Agent examples. It’s fully open source under the Apache 2 license. If you want to get started:
- npm: npm i @inngest/agent-kit
- GitHub: https://ift.tt/tyMzYEx
- Docs: https://ift.tt/zniSp6l

We’re excited to finally launch AgentKit; let us know what you think!
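To make the routing loop above concrete, here is a dependency-free sketch of the pattern as described: a code-based router inspects typed state on each iteration and returns the next agent, or nothing to stop. All names here are illustrative, not AgentKit's actual exports; see the docs linked above for the real API:

```typescript
// Illustrative sketch of the routing loop described above; not AgentKit's API.
type State = { history: string[]; data: Record<string, unknown> };
type Agent = {
  name: string;
  run: (state: State) => Promise<Partial<State["data"]>>; // e.g. an LLM call plus tools
};
// A code-based router: inspect typed state, return the next agent, or undefined to stop.
type Router = (state: State) => Agent | undefined;

async function runNetwork(router: Router, state: State): Promise<State> {
  for (;;) {
    const agent = router(state);                // deterministic: plain code over typed state
    if (!agent) return state;                   // no agent returned -> the network is done
    const updates = await agent.run(state);
    state.data = { ...state.data, ...updates }; // the agent's tools update shared state
    state.history.push(`ran ${agent.name}`);
  }
}

// Example handoff: run a planner until a plan exists, then hand off to an editor.
const planner: Agent = { name: "planner", run: async () => ({ plan: "..." }) };
const editor: Agent = { name: "editor", run: async () => ({ done: true }) };
const router: Router = (s) =>
  !s.data.plan ? planner : !s.data.done ? editor : undefined;

runNetwork(router, { history: [], data: {} }).then((s) => console.log(s.history));
// -> ["ran planner", "ran editor"]
```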
Wednesday, 19 March 2025
New top story on Hacker News: Show HN: We built an agentic image editor that preserves the original structure
Show HN: We built an agentic image editor that preserves the original structure
9 by sakofchit | 6 comments on Hacker News.
Hi everyone, I’ve been experimenting with an app where you can edit images in your camera roll simply by tweaking a photo’s metadata (changing its location/time), and our agent will contextually regenerate the photo in that place and time in one shot. There's no prompting involved. One of the hardest problems we’ve seen with these AI image editing/creation tools is that they struggle to preserve the subjects of the original image (faces, genders, number of people, bodies, animals, etc.), and I think we’ve gotten a step closer to making it feel more realistic. The gallery has some examples that people have been regenerating: https://ift.tt/dCgKqhI Here’s a demo: https://ift.tt/azymMrd Feel free to DM me on Twitter ( https://twitter.com/sakofchit ) if you’d like to try out the TestFlight in the meantime. Would love to know what y'all think!
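The post doesn't describe the pipeline, so this is only a hypothetical sketch of the interaction model: the user's entire "edit" is a metadata delta, and the agent turns the original image plus the old and new metadata into a regeneration request. None of these types or names are the app's real API:

```typescript
// Hypothetical interaction model: the metadata delta is the entire edit.
type PhotoMetadata = {
  latitude: number;
  longitude: number;
  takenAt: string; // ISO 8601 timestamp
};

type RegenerationRequest = {
  imageId: string;       // reference to the original pixels (subjects to preserve)
  before: PhotoMetadata; // what the EXIF originally said
  after: PhotoMetadata;  // what the user changed it to
};

function buildRequest(
  imageId: string,
  before: PhotoMetadata,
  after: PhotoMetadata
): RegenerationRequest {
  // No text prompt anywhere: the place & time changes carry all the intent.
  return { imageId, before, after };
}

const req = buildRequest(
  "IMG_0042",
  { latitude: 40.7128, longitude: -74.006, takenAt: "2024-06-01T14:00:00Z" },
  { latitude: 48.8566, longitude: 2.3522, takenAt: "2024-12-24T20:00:00Z" }
);
console.log(req); // NYC afternoon -> Paris on Christmas Eve, no prompt written
```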
Tuesday, 18 March 2025
Monday, 17 March 2025
Sunday, 16 March 2025
Saturday, 15 March 2025
Friday, 14 March 2025
New top story on Hacker News: Show HN: OCR Benchmark Focusing on Automation
Show HN: OCR Benchmark Focusing on Automation
3 by prats226 | 0 comments on Hacker News.
The OCR/document extraction field has seen a lot of action recently, with releases like Mistral OCR, Andrew Ng's agentic document processing, etc. There are also several benchmarks for OCR, but they all test for something slightly different, which makes a good comparison of models very hard. To give an example, some models like mistral-ocr only try to convert a document to markdown format; you have to use another LLM on top of it to get the final result. Some VLMs directly give structured information like key fields from documents such as invoices, but you have to either add business rules on top or use an LLM-as-a-judge setup to get a sense of which outputs need manual review and which can be taken as correct. No benchmark attempts to measure the actual rate of automation you can achieve.

We have tried to solve this problem with a benchmark that applies only to documents/use cases where you are looking for automation, and that tries to measure the end-to-end automation level of different models or systems. We have collected a dataset representing documents, like invoices, that appear in processes where automation is needed, as opposed to more copilot-style use cases where you chat with a document. We have also annotated these documents and published the dataset and repo so the benchmark can be extended. (A sketch of the kind of automation-rate metric this implies follows below.)

Write-up: https://ift.tt/UmajqRL Dataset: https://ift.tt/el8PGBN GitHub: https://ift.tt/wH23Ir8

Looking for suggestions on how this benchmark can be improved further.
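The post doesn't spell out the metric, but an end-to-end automation rate along the lines described might be computed like this. The field names, exact-match rule, and confidence threshold are all assumptions for illustration, not the benchmark's actual definition:

```typescript
// Hypothetical automation-rate metric: a document counts as "automated" only if
// the system was confident enough to skip review AND every field was correct.
type FieldPrediction = { value: string; confidence: number };
type DocResult = {
  predicted: Record<string, FieldPrediction>;
  groundTruth: Record<string, string>;
};

function isAutomated(doc: DocResult, threshold = 0.9): boolean {
  return Object.entries(doc.groundTruth).every(([field, truth]) => {
    const pred = doc.predicted[field];
    return (
      pred !== undefined &&
      pred.confidence >= threshold &&    // would have skipped manual review...
      pred.value.trim() === truth.trim() // ...and was actually correct
    );
  });
}

function automationRate(results: DocResult[], threshold = 0.9): number {
  if (results.length === 0) return 0;
  const automated = results.filter((d) => isAutomated(d, threshold)).length;
  return automated / results.length; // fraction processed with no human touch
}
```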
Thursday, 13 March 2025
Wednesday, 12 March 2025
Tuesday, 11 March 2025
Monday, 10 March 2025
Sunday, 9 March 2025
Saturday, 8 March 2025
Friday, 7 March 2025
New top story on Hacker News: Kitchen foil and Algerian markets: When your phone is stolen in London
Kitchen foil and Algerian markets: When your phone is stolen in London
3 by BerislavLopac | 0 comments on Hacker News.
Thursday, 6 March 2025
Wednesday, 5 March 2025
Tuesday, 4 March 2025
New top story on Hacker News: Show HN: Open-source Deep Research across workplace applications
Show HN: Open-source Deep Research across workplace applications
13 by yuhongsun | 1 comment on Hacker News.
I’ve been using deep research on OpenAI and Perplexity, and it’s been amazing at gathering data across a lot of related and chained searches. Just earlier today, I asked: “What are some marquee tech companies / hot startups (not including the giants like FAAMG, Samsung, Nvidia etc.)?” It’s a pretty involved question, and looking up “marquee tech startups” or "hot tech startups" on Google gave me nothing useful. Deep research on both ChatGPT and Perplexity gave really high-quality responses, with ChatGPT leaning toward slightly larger scaleups and Perplexity toward up-and-coming companies.

Given how useful AI research agents are across the internet, we decided to build an open-source equivalent for the workplace, since a ton of questions at work also cannot be easily resolved with a single search. Onyx supports deep research connected to company applications like Google Drive, Salesforce, SharePoint, GitHub, Slack, and 30+ others.

For example, an engineer may want to know: “What’s happening with the verification email failure?” Onyx’s AI agent would first figure out what it needs to answer this question: what caused the failure, what has been done to address it, has this come up before, and what’s the latest status on the issue. The agent would run parallel searches through Confluence, email, Slack, and GitHub to get the answers, then combine them into a coherent overview. If the agent finds that a technical blocker will delay the resolution, it will adjust mid-flight and research to get more context on the blocker. (A sketch of this fan-out-and-combine pattern follows below.)

Here’s a video demo I recorded: https://www.youtube.com/watch?v=drvC0fWG4hE If you want to get started with the GitHub repo, check out our guides at https://docs.onyx.app . Or, to play with it without needing to deploy anything, go to https://ift.tt/mIfqjgX

P.S. There are a lot of cool technical details behind building a system like this, so I’ll continue the conversation in the comments.
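Onyx's internals live in the repo linked above; purely to make the described flow concrete, here is a dependency-free sketch of the decompose / fan-out / combine pattern. All names are illustrative, not Onyx's actual API:

```typescript
// Illustrative sketch of the described flow: decompose a question into
// sub-queries, search every connector in parallel, then combine the results.
type SearchHit = { source: string; snippet: string };
type Connector = {
  name: string;
  search: (query: string) => Promise<SearchHit[]>;
};

// Stand-in for an LLM call that breaks the question into sub-questions
// (cause, mitigation, prior incidents, current status, per the example above).
function decompose(question: string): string[] {
  return [
    `root cause of: ${question}`,
    `fixes attempted for: ${question}`,
    `prior incidents like: ${question}`,
    `current status of: ${question}`,
  ];
}

async function deepResearch(
  question: string,
  connectors: Connector[]
): Promise<SearchHit[]> {
  const subQueries = decompose(question);
  // Fan out: every sub-query against every connector, all in parallel.
  const results = await Promise.all(
    subQueries.flatMap((q) => connectors.map((c) => c.search(q)))
  );
  // Combine: a real system would have an LLM synthesize these into an
  // overview, and could trigger follow-up rounds if it finds a blocker.
  return results.flat();
}
```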
Monday, 3 March 2025
Sunday, 2 March 2025
New top story on Hacker News: Speedrunners are vulnerability researchers, they just don't know it yet
Speedrunners are vulnerability researchers, they just don't know it yet
34 by chc4 | 12 comments on Hacker News.
Saturday, 1 March 2025
New top story on Hacker News: Drone captures narwhals using their tusks to explore, forage and play
Drone captures narwhals using their tusks to explore, forage and play
10 by dnetesn | 0 comments on Hacker News.
New top story on Hacker News: Making o1, o3, and Sonnet 3.7 Hallucinate for Everyone
Making o1, o3, and Sonnet 3.7 Hallucinate for Everyone
16 by hahahacorn | 6 comments on Hacker News.