genbench (GenBench)

koustuvs

authored a paper 3 months ago

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Paper • 2506.09985 • Published Jun 11 • 30

kazemnejad

authored 2 papers 5 months ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published Apr 11 • 27

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 87

yanaiela

authored a paper 5 months ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9 • 77

koustuvs

authored a paper 5 months ago

Scaling Language-Free Visual Representation Learning

Paper • 2504.01017 • Published Apr 1 • 33

koustuvs

authored 5 papers 7 months ago

How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts

Paper • 2205.10762 • Published May 22, 2022

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Paper • 2412.14164 • Published Dec 18, 2024 • 4

yanaiela

authored a paper 11 months ago

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Paper • 2410.19133 • Published Oct 24, 2024 • 11

kazemnejad

authored 3 papers 11 months ago

The Impact of Positional Encoding on Length Generalization in Transformers

Paper • 2305.19466 • Published May 31, 2023 • 2

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Paper • 2410.01679 • Published Oct 2, 2024 • 27

Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models

Paper • 2305.14775 • Published May 24, 2023

yanaiela

authored 6 papers about 1 year ago

Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

Paper • 2004.07667 • Published Apr 16, 2020

Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation

Paper • 2305.16938 • Published May 26, 2023

Text-based NP Enrichment

Paper • 2109.12085 • Published Sep 24, 2021

A Survey on Data Selection for Language Models

Paper • 2402.16827 • Published Feb 26, 2024 • 4

Lexical Generalization Improves with Larger Models and Longer Training

Paper • 2210.12673 • Published Oct 23, 2022

Data Contamination Report from the 2024 CONDA Shared Task

Paper • 2407.21530 • Published Jul 31, 2024 • 10


Team members 5
