Spaces:
Running
Running
File size: 3,446 Bytes
18386c7 be9d670 eaceac4 18386c7 eaceac4 be9d670 f219f0a be9d670 f219f0a be9d670 f219f0a be9d670 f219f0a be9d670 f219f0a be9d670 f219f0a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 |
---
title: Research Tracker MCP
emoji: 🔬
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.38.2
app_file: app.py
pinned: false
---
# Research Tracker MCP Server
A robust, performant MCP (Model Context Protocol) server that provides research inference utilities following MCP best practices. This server extracts research metadata from paper URLs, repository links, or research names using intelligent scraping and API integration.
## Features
- **Author inference** from papers and repositories
- **Cross-platform resource discovery** (papers, code, models, datasets)
- **Research metadata extraction** (names, dates, licenses)
- **URL classification** and relationship mapping
- **Comprehensive research ecosystem analysis**
- **Rate limiting** to prevent API abuse
- **Request caching** with TTL for performance
- **Robust error handling** with typed exceptions
- **Security validation** for all URLs
- **Retry logic** with exponential backoff
## Available MCP Tools
All functions are optimized for MCP usage with clear type hints and docstrings:
- `infer_authors` - Extract author names from papers and repositories
- `infer_paper_url` - Find associated research paper URLs
- `infer_code_repository` - Discover code repository links
- `infer_research_name` - Extract research project names
- `classify_research_url` - Classify URL types (paper/code/model/etc.)
- `infer_organizations` - Identify affiliated organizations
- `infer_publication_date` - Extract publication dates
- `infer_model` - Find associated HuggingFace models
- `infer_dataset` - Find associated HuggingFace datasets
- `infer_space` - Find associated HuggingFace spaces
- `infer_license` - Extract license information
- `find_research_relationships` - Comprehensive research ecosystem analysis
## Input Support
- arXiv paper URLs (https://arxiv.org/abs/...)
- HuggingFace paper URLs (https://huggingface.co/papers/...) - **Preferred over arXiv for better resource discovery**
- GitHub repository URLs (https://github.com/...)
- HuggingFace model/dataset/space URLs
- Research paper titles and project names
- Project page URLs (github.io)
## MCP Best Practices Implementation
This server follows official MCP best practices:
1. **Security**: URL validation, domain allowlisting, input sanitization
2. **Performance**: Request caching, rate limiting, connection pooling
3. **Reliability**: Retry logic, graceful error handling, timeout management
4. **Documentation**: Comprehensive docstrings with examples for all tools
5. **Error Handling**: Typed exceptions for different failure scenarios
## Environment Variables
- `HF_TOKEN` - Hugging Face API token (required)
- `GITHUB_AUTH` - GitHub API token (optional, enables enhanced GitHub integration)
## Usage
The server automatically launches as an MCP server when run. All inference functions are exposed as MCP tools for seamless integration with Claude and other AI assistants.
### Example
Test with the 3D Arena paper:
```
Input: https://arxiv.org/abs/2506.18787
Finds: dataset (dylanebert/iso3d), space (dylanebert/LGM-tiny), and more
```
### Rate Limits
- 30 requests per minute per tool
- Automatic caching reduces duplicate requests
- Graceful error messages when limits exceeded
### Error Handling
The server provides clear error messages:
- `ValidationError`: Invalid or malicious URLs
- `ExternalAPIError`: External service failures
- `MCPError`: Rate limiting or other MCP issues |