Search: [paper]

Paper page - LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

The success of Large Language Models (LLMs) has sparked interest in various agentic applications. A key hypothesis is that LLMs, leveraging common sense and Chain-of-Thought (CoT) reasoning, can effectively explore and efficiently solve complex domains. However, LLM agents have been found to suffer from sub-optimal exploration and the knowing-doing gap, the inability to effectively act on knowledge present in the model. In this work, we systematically study why LLMs perform sub-optimally in decision-making scenarios. In particular, we closely examine three prevalent failure modes: greediness, frequency bias, and the knowing-doing gap. We propose mitigation of these shortcomings by fine-tuning via Reinforcement Learning (RL) on self-generated CoT rationales. Our experiments across multi-armed bandits, contextual bandits, and Tic-tac-toe, demonstrate that RL fine-tuning enhances the decision-making abilities of LLMs by increasing exploration and narrowing the knowing-doing gap. Finally, we study both classic exploration mechanisms, such as epsilon-greedy, and LLM-specific approaches, such as self-correction and self-consistency, to enable more effective fine-tuning of LLMs for decision-making.

paper · llm

April 27, 2025 at 4:29:07 PM EDT * · permalink

·

https://huggingface.co/papers/2504.16078

Does Trend-Following Still Work on Stocks? by Carlo Zarattini, Alberto Pagani, Cole Wilcox :: SSRN

This paper revisits and extends the results presented in 2005 by Wilcox and Crittenden in a white paper titled Does Trend Following Work on Stocks? Leveraging a

paper

April 2, 2025 at 6:05:52 PM EDT * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5084316

Public crypto networks as financial market infrastructures by Ulrich Bindseil, Omid Malekan :: SSRN

The design of financial instruments and processes has always been contingent on the infrastructure such products live on. Blockchain technology, as utilized by

paper · blockchain · defi

February 25, 2025 at 2:25:31 AM EST * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5138052

in-context learning

paper · ml

December 30, 2024 at 7:24:07 PM EST * · permalink

·

https://x.com/DimitrisPapail/status/1873503233378820257

[2411.04997] LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

LLM2CLIP: Research showing how to improve CLIP's image-text matching abilities by replacing its text encoder with a frozen LLM (like Llama) and a trainable adapter. The key innovation is fine-tuning the LLM first to make its outputs more discriminative, then using it to help CLIP's vision encoder better understand language. Results show major improvements in matching detailed descriptions to images, handling long text, and even working across languages, while requiring relatively little training time and compute.

ml · paper

November 14, 2024 at 3:28:51 PM EST * · permalink

·

https://arxiv.org/abs/2411.04997

Statistical Arbitrage via Single-View and Multi-View Spectral Clustering on Mixed Frequency Data

"Systematically identifying clusters of similar assets is a critical step in statistical arbitrage strategies... Profitability is influenced more by the selection of feature sets and clustering methods than by the choice of signals."

paper · finance

November 11, 2024 at 11:09:33 AM EST * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4975855

[2410.01792] When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1

ml · paper

October 24, 2024 at 1:30:44 PM EDT * · permalink

·

https://arxiv.org/abs/2410.01792

[2410.01201] Were RNNs All We Needed?

ml · paper

October 24, 2024 at 1:30:37 PM EDT * · permalink

·

https://arxiv.org/abs/2410.01201

Machine Learning and the Yield Curve: Tree-Based Macroeconomic Regime Switching by Siyu Bie, Francis X. Diebold, Jingyu He, Junye Li :: SSRN

We explore tree-based macroeconomic regime-switching in the context of the dynamic Nelson-Siegel (DNS) yield-curve model.

finance · ml · paper

September 26, 2024 at 11:32:06 AM EDT * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4934442

[2203.14465] STaR: Bootstrapping Reasoning With Reasoning

Generating step-by-step "chain-of-thought" rationales improves language model performance on complex reasoning tasks like mathematics or commonsense question-answering.

ml · llm · paper

September 14, 2024 at 7:20:47 PM EDT * · permalink

·

https://arxiv.org/abs/2203.14465

THE HYBRID FORECAST OF S&P 500 VOLATILITY ENSEMBLED FROM VIX, GARCH AND LSTM MODELS

hybrid LSTM models, significantly outperform the traditional GARCH models

finance · paper · llm

September 9, 2024 at 10:20:59 AM EDT * · permalink

·

https://www.wne.uw.edu.pl/application/files/4417/1949/0286/WNE_WP449.pdf

[2409.01666] In Defense of RAG in the Era of Long-Context Language Models

llm · ai · paper

September 4, 2024 at 10:27:45 PM EDT * · permalink

·

https://arxiv.org/abs/2409.01666

[2407.13931] Who Wins Ethereum Block Building Auctions and Why?

blockchain · paper

July 23, 2024 at 8:25:48 AM EDT * · permalink

·

https://arxiv.org/abs/2407.13931

The Yieldy Put: To Infinity and Beyond Bonds | Man Institute | Man Group

Liquid alternative strategies, specifically trend-following and long/short quality stocks, could be viewed as the new bonds.

paper · finance

July 23, 2024 at 12:49:48 AM EDT * · permalink

·

https://www.man.com/maninstitute/yieldy-put-infinity-and-beyond-bonds

Robust Stock Index Return Predictions Using Deep Learning * by Ravi Jagannathan, Yuan Liao, Andreas Neuhierl :: SSRN

We introduce a conditional machine learning approach to forecast the stock index return. Our approach is designed to work well for short-horizon forecasts to ad

finance · paper

July 16, 2024 at 6:14:17 PM EDT * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4890466

Momentum and Trend Following Trading Strategies for Currencies Revisited - Combining Academia and Industry by Janick Rohrbach, Silvan Suremann, Joerg Osterrieder :: SSRN

Momentum trading strategies are thoroughly described in the academic literature and used in many trading strategies by hedge funds, asset managers, and propriet

trade · paper

June 13, 2024 at 5:01:32 PM EDT * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2949379

Stock Market Prices and Dividends in the US: Bubbles or Long-Run Equilibria Relationships? by OlaOluwa S. Yaya, Luis A. Gil-Alana, Robinson Dettoni :: SSRN

This paper presents a novel approach to identifying potential bubbles in the US stock market by employing alternative time series methods based on long memory,

timeseries · paper

May 9, 2024 at 9:18:32 AM EDT * · permalink

·

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4794169

[2310.10688] A decoder-only foundation model for time-series forecasting

Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder style attention model on a large time-series corpus, and can work well across different forecasting history lengths, prediction lengths and temporal granularities.

finance · ml · paper

February 23, 2024 at 8:33:36 PM EST * · permalink

·

https://arxiv.org/abs/2310.10688

FinBen: An Holistic Financial Benchmark for Large Language Models

https://huggingface.co/papers/2402.12659

finance · ml · paper

February 23, 2024 at 8:15:53 PM EST * · permalink

·

https://arxiv.org/abs/2402.12659

DYNAMIC TRANSACTION FEE MECHANISM DESIGN

blockchain · economics · paper · eip1559

February 15, 2024 at 9:12:29 PM EST * · permalink

·

https://www.mechanism.org/spec/04