The success of Large Language Models (LLMs) has sparked interest in various agentic applications. A key hypothesis is that LLMs, leveraging common sense and Chain-of-Thought (CoT) reasoning, can effectively explore and efficiently solve complex domains. However, LLM agents have been found to suffer from sub-optimal exploration and the knowing-doing gap: the inability to act effectively on knowledge present in the model. In this work, we systematically study why LLMs perform sub-optimally in decision-making scenarios. In particular, we closely examine three prevalent failure modes: greediness, frequency bias, and the knowing-doing gap. We propose mitigating these shortcomings by fine-tuning with Reinforcement Learning (RL) on self-generated CoT rationales. Our experiments across multi-armed bandits, contextual bandits, and Tic-tac-toe demonstrate that RL fine-tuning enhances the decision-making abilities of LLMs by increasing exploration and narrowing the knowing-doing gap. Finally, we study both classic exploration mechanisms, such as epsilon-greedy, and LLM-specific approaches, such as self-correction and self-consistency, to enable more effective fine-tuning of LLMs for decision-making.
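The classic exploration baseline the abstract names, epsilon-greedy on a multi-armed bandit, can be sketched as follows. This is a minimal illustrative sketch, not the paper's experimental setup: the arm means, Gaussian reward noise, and hyperparameters are assumptions chosen for the example.

```python
import random

def epsilon_greedy_bandit(arm_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy on a stationary multi-armed bandit.

    With probability epsilon pick a random arm (explore); otherwise pick
    the arm with the highest estimated mean reward (exploit). Rewards are
    the true arm mean plus small Gaussian noise (illustrative assumption).
    """
    rng = random.Random(seed)
    n = len(arm_means)
    counts = [0] * n        # pulls per arm
    values = [0.0] * n      # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                          # explore
        else:
            arm = max(range(n), key=lambda a: values[a])    # exploit
        reward = arm_means[arm] + rng.gauss(0, 0.1)
        counts[arm] += 1
        # incremental update of the running mean
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return values, counts, total

values, counts, _ = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

After enough exploration the best arm (mean 0.8) dominates the pull counts; the paper's point is that an unmodified LLM agent, acting greedily on early evidence, often fails to reach this behavior.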
This paper revisits and extends the results presented in 2005 by Wilcox and Crittenden in a white paper titled Does Trend Following Work on Stocks? Leveraging a
The design of financial instruments and processes has always been contingent on the infrastructure such products live on. Blockchain technology, as utilized by
LLM2CLIP: Research showing how to improve CLIP's image-text matching abilities by replacing its text encoder with a frozen LLM (like Llama) and a trainable adapter. The key innovation is fine-tuning the LLM first to make its outputs more discriminative, then using it to help CLIP's vision encoder better understand language. Results show major improvements in matching detailed descriptions to images, handling long text, and even working across languages, while requiring relatively little training time and compute.
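The frozen-LLM-plus-trainable-adapter design described above can be sketched with plain numpy. The dimensions, mean pooling, and two-layer projection are illustrative assumptions; only the adapter weights would be trained in the actual method, while the LLM stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: the frozen LLM's hidden width and CLIP's joint space.
LLM_HIDDEN, CLIP_DIM = 4096, 768

# Trainable adapter parameters (randomly initialized here for the sketch).
W = rng.normal(scale=0.02, size=(LLM_HIDDEN, CLIP_DIM))

def adapt_text_embedding(llm_hidden_states):
    """Mean-pool the frozen LLM's token hidden states, project them through
    the trainable adapter, and L2-normalize into CLIP's joint space."""
    pooled = llm_hidden_states.mean(axis=0)   # (LLM_HIDDEN,)
    emb = pooled @ W                          # (CLIP_DIM,)
    return emb / np.linalg.norm(emb)

# Fake hidden states standing in for a frozen LLM's output on one caption.
tokens = rng.normal(size=(12, LLM_HIDDEN))
emb = adapt_text_embedding(tokens)
```

During training, a contrastive loss between these text embeddings and CLIP's image embeddings would update only `W`, which is why the approach needs comparatively little compute.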
"Systematically identifying clusters of similar assets is a critical step in statistical arbitrage strategies... Profitability is influenced more by the selection of feature sets and clustering methods than by the choice of signals."
We explore tree-based macroeconomic regime-switching in the context of the dynamic Nelson-Siegel (DNS) yield-curve model.
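The DNS model mentioned above builds on the static Nelson-Siegel yield curve, which can be written down directly. The decay parameter value below follows the Diebold-Li convention with maturity in months; that choice, and the example betas, are assumptions for illustration.

```python
import numpy as np

def nelson_siegel(tau, beta0, beta1, beta2, lam=0.0609):
    """Nelson-Siegel yield at maturity tau (months).

    beta0 loads on the level factor, beta1 on the slope factor, and
    beta2 on the curvature factor; lam=0.0609 is the Diebold-Li
    monthly-convention decay (an assumption here).
    """
    x = lam * tau
    slope = (1 - np.exp(-x)) / x
    curvature = slope - np.exp(-x)
    return beta0 + beta1 * slope + beta2 * curvature

# Short end approaches beta0 + beta1; long end approaches beta0 (level).
y_short = nelson_siegel(1e-3, beta0=0.05, beta1=-0.02, beta2=0.01)
y_long = nelson_siegel(1200, beta0=0.05, beta1=-0.02, beta2=0.01)
```

In the dynamic (DNS) version the betas become time-varying factors, and the tree-based regime-switching idea lets their dynamics depend on macroeconomic state variables.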
Generating step-by-step "chain-of-thought" rationales improves language model performance on complex reasoning tasks like mathematics or commonsense question-answering.
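The chain-of-thought idea amounts to a prompting pattern: the few-shot exemplar shows an explicit rationale before the answer, steering the model to reason step by step. A minimal sketch (the exemplar question and helper name are illustrative, not from the paper):

```python
# Few-shot exemplar whose answer includes the intermediate reasoning.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the chain-of-thought exemplar to a new question so the
    model is prompted to emit its own step-by-step rationale."""
    return COT_EXEMPLAR + f"\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "A farmer plants 3 fields with 7 rows of 4 plants each. How many plants?"
)
```

Without the rationale in the exemplar (standard few-shot prompting), models tend to jump straight to an answer and make more arithmetic and commonsense errors.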
Hybrid LSTM models significantly outperform traditional GARCH models.
Liquid alternative strategies, specifically trend-following and long/short quality stocks, could be viewed as the new bonds.
We introduce a conditional machine learning approach to forecast the stock index return. Our approach is designed to work well for short-horizon forecasts to ad
Momentum trading strategies are thoroughly described in the academic literature and used in many trading strategies by hedge funds, asset managers, and propriet
This paper presents a novel approach to identifying potential bubbles in the US stock market by employing alternative time series methods based on long memory,
Motivated by recent advances in large language models for Natural Language Processing (NLP), we design a time-series foundation model for forecasting whose out-of-the-box zero-shot performance on a variety of public datasets comes close to the accuracy of state-of-the-art supervised forecasting models for each individual dataset. Our model is based on pretraining a patched-decoder style attention model on a large time-series corpus, and can work well across different forecasting history lengths, prediction lengths and temporal granularities.
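The "patched" input described above, where a time series is cut into fixed-length patches that serve as tokens for the decoder, can be sketched as follows. The zero-padding convention at the front is an assumption for the example, not necessarily the model's exact preprocessing.

```python
import numpy as np

def patch_series(series, patch_len):
    """Split a 1-D time series into non-overlapping patches (the input
    tokens of a patched-decoder forecaster). If the history length is not
    a multiple of patch_len, the front is zero-padded (assumed convention).
    """
    series = np.asarray(series, dtype=float)
    pad = (-len(series)) % patch_len
    padded = np.concatenate([np.zeros(pad), series])
    return padded.reshape(-1, patch_len)

patches = patch_series(np.arange(10.0), patch_len=4)
```

Because each patch is a token regardless of the series' original sampling, the same pretrained model can handle different history lengths and temporal granularities, which is what enables the zero-shot use described in the abstract.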