Arabic Machine Learning

non-profit

https://github.com/ARBML

arabicml2

arbml

AI & ML interests

Arabic NLP, computer vision, etc.

Nymbo

posted an update 11 days ago

Post

621

I built a general use MCP space ~ Fetch webpages, DuckDuckGo search, Python code execution, Kokoro TTS, Image Gen, Video Gen.

# Tools

1. Fetch webpage
2. Web search via DuckDuckGo (very concise, low excess context)
3. Python code executor
4. Kokoro-82M speech generation
5. Image Generation (use any model from HF Inference Providers)
6. Video Generation (use any model from HF Inference Providers)

The first four tools can be used without any API keys whatsoever. DDG search is free, and the code execution and speech gen are done on CPU. Having a HF_READ_TOKEN in the env variables will show all tools. If there isn't a key present, the Image/Video Gen tools are hidden.

Nymbo/Tools
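
The token gating described in the post can be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: the tool names and the `visible_tools` helper are hypothetical, and treating an empty `HF_READ_TOKEN` the same as a missing one is an assumption.

```python
import os

# Hypothetical tool identifiers based on the list in the post;
# the real Space may name or register its tools differently.
KEYLESS_TOOLS = ["fetch_webpage", "web_search", "python_executor", "speech_generation"]
TOKEN_TOOLS = ["image_generation", "video_generation"]


def visible_tools(env=None):
    """Return the tools to expose, given a mapping of environment variables.

    The image/video tools are included only when HF_READ_TOKEN is set to a
    non-empty value (assumption: an empty string counts as "no key present").
    """
    env = os.environ if env is None else env
    tools = list(KEYLESS_TOOLS)
    if env.get("HF_READ_TOKEN"):
        tools += TOKEN_TOOLS
    return tools


if __name__ == "__main__":
    # Lists four tools without a token, six with one.
    print(visible_tools())
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real process environment.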

Nymbo

posted an update 19 days ago

Post

895

Anyone using Jan-v1-4B for local MCP-based web search, I highly recommend you try out Intelligent-Internet/II-Search-4B

Very impressed with this lil guy and it deserves more downloads. It's based on the original version of Qwen3-4B, but I find that it questions reality way less often. Jan-v1 seems to think that everything it sees is synthetic data and constantly gaslights me

alielfilali01

posted an update 30 days ago

Post

517

Guys WTH is "yofo-*" ???
Most OpenAI staff associated with the openai/gpt-oss-68911959590a1634ba11c7a4 release are affiliated with dozens of yofo orgs ...

e.g.

yofo-wildflower

Some HF folks as well 👀

BounharAbdelaziz

authored a paper about 2 months ago

Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts

Paper • 2507.04569 • Published Jul 6 • 19

DrMostafa

authored a paper 2 months ago

Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training

Paper • 2506.21594 • Published Jun 18 • 7

SaiedAlshahrani

updated a dataset 2 months ago

arbml/CIDAR

Viewer • Updated Jul 1 • 10k • 77 • 51

SaiedAlshahrani

in arbml/CIDAR 2 months ago

Update README.md

#3 opened 2 months ago by SaiedAlshahrani

Zaid

in arbml/CIDAR 2 months ago

Update README.md

#3 opened 2 months ago by SaiedAlshahrani

Nymbo

posted an update 2 months ago

Post

2823

Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP first released with just the default example spaces added. I added lots of other MCP spaces but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart it in mcp.json, but claude.ai has no such option. Anyone got any ideas?

4 replies

alielfilali01

authored a paper 3 months ago

Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi

Paper • 2504.06011 • Published Apr 8 • 2

Nymbo

posted an update 4 months ago

Post

4095

Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?

1 reply

Nymbo

posted an update 4 months ago

Post

2769

PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies that have been around since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.

alielfilali01

posted an update 4 months ago

Post

829

Great effort from the @AtlasIA folks in adapting text2image models (Ghibli style) to the Moroccan context

Read the blog here: https://huggingface.co/blog/atlasia/creating-your-custom-ghibli-text-to-image-model

Zaid

updated a dataset 5 months ago

arbml/CIDAR

Viewer • Updated Jul 1 • 10k • 77 • 51

not-lain

posted an update 6 months ago

Post

4645

🚀AraClip is now fully integrated with Hugging Face 🤗

AraClip is a specialized CLIP model that was created by @pain and optimized for Arabic text-image retrieval tasks🔥

🔗 Try it out 🔗
🤖 model: Arabic-Clip/araclip
🧩 Gradio demo: Arabic-Clip/Araclip-Simplified
🌐 website: https://arabic-clip.github.io/Arabic-CLIP/

2 replies

alielfilali01

posted an update 7 months ago

Post

1080

🚨 Arabic LLM Evaluation 🚨

A few models joined the ranking of https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard today.

The new MistralAI model, Saba, is quite impressive: top 10! Well done @arthurmensch and team.

Sadly, Mistral did not follow its strategy of public weights this time. We hope this changes soon and that we get the model under a permissive license.

We added other Mistral models, and apparently we have been sleeping on mistralai/Mistral-Large-Instruct-2411!

Another impressive model that joined the ranking today is ALLaM-AI/ALLaM-7B-Instruct-preview. After a long wait, ALLaM is finally here, and it is IMPRESSIVE given its size!

ALLaM is ranked on OALL/Open-Arabic-LLM-Leaderboard as well.

not-lain

posted an update 7 months ago

Post

4537

I have just released a new blogpost about KV caching and its role in inference speedup 🚀
🔗 https://huggingface.co/blog/not-lain/kv-caching/
some takeaways :

4 replies

not-lain

posted an update 8 months ago

Post

1798

we now have more than 2000 public AI models using ModelHubMixin🤗

not-lain

posted an update 8 months ago

Post

4140

Published a new blogpost 📖
In this blogpost I go through the transformer architecture, emphasizing how shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :

alielfilali01

posted an update 8 months ago

Post

2155

3C3H AraGen Leaderboard today welcomes deepseek-ai/DeepSeek-V3 and 12 other models (including the late gpt-3.5 💀) to the ranking of the best LLMs in Arabic!

Observations:
- DeepSeek-v3 ranked 3rd and is the only open model among the top 5!

- A 14B open model (Qwen/Qwen2.5-14B-Instruct) outperforms gpt-3.5-turbo-0125 (from last year). This shows how far we have come in advancing and supporting the Arabic presence within the LLM ecosystem!

- Contrary to what is observed in likelihood-acc leaderboards (like OALL/Open-Arabic-LLM-Leaderboard), further fine-tuned models like maldv/Qwentile2.5-32B-Instruct actually performed worse than the original model, Qwen/Qwen2.5-32B-Instruct.
It's worth noting that the decrease is statistically insignificant, which implies that, at best, out-of-domain fine-tuning does not really hurt the capabilities the model acquired during pretraining.
Previous work has addressed this (fine-tuning vs. pretraining), but more investigation is required (any PhDs here? This could be your question ...)

Check out the latest rankings: https://huggingface.co/spaces/inceptionai/AraGen-Leaderboard

Team members 301
