Questions and Answers
What are the characteristics of no-arbitrage in asset pricing theory?
Which issue presents a challenge to machine learning theory in finance?
What is the first step in building signals for a portfolio using machine learning?
How can machine learning be applied to process data in finance?
What is the purpose of using a validation sample in machine learning?
Why is it questioned whether machine learning performance is spurious in finance?
What does the process of building a portfolio entail after creating signals?
What is an important aspect often expected from students regarding code in the exam?
Which method for building portfolio weights involves buying and selling stocks proportionally to their market capitalization?
What is the primary goal when measuring the performance of a machine learning portfolio?
In constructing a long-short portfolio, what is the expected relationship between decile of signal and mean return?
What statistical measure is primarily used to evaluate the risk-adjusted performance of a portfolio?
What is indicated by a statistically significant and positive alpha in the context of portfolio performance?
Why is the Value Weighted (VW) strategy considered to have lower transaction costs compared to the Equally Weighted (EW) strategy?
Which of the following factors is NOT typically included in performance benchmarking of a portfolio?
What is a key drawback of using out-of-sample signals directly as weights in portfolio construction?
What is a classical neural network commonly used for?
According to the information, LLMs perform poorly in predicting outcomes for which type of firms?
What is emphasized as crucial to be fully prepared for the exam?
What is stated regarding the writing of code during the exam?
What should students focus on in addition to understanding content for the exam?
What effect does increasing the penalty have in the Penalized Markowitz model?
What is a significant advantage of using Penalized Markowitz compared to traditional methods?
What is the purpose of tokenizing text in the context of LLMs?
Why does positional encoding need to use a trigonometric function in LLMs?
Which of the following models is considered traditional before the advent of LLMs in finance?
What is the primary function of attention heads in LLMs?
Which technique is not associated with textual analysis in finance before LLMs?
What is a key limitation of traditional models such as Bag-of-words and TF-IDF in finance?
What is the main function of the attention heads in a modern LLM?
What does the latent representation produced by the attention heads represent?
How does the classical neural network interact with the output of the attention heads?
What is one method used during inference to limit the randomness of token generation?
What proportion of the training set consists of general knowledge according to the provided information?
How is the training sample for the model created?
What mathematical operation is applied to calculate attention values among tokens?
What is a characteristic of the tensor produced by aggregating outputs of attention heads?
Study Notes
Asset Pricing Theory vs Machine Learning
- Asset pricing theory holds that stock returns are explained by a factor structure with only a few factors (5-10 at most).
- Machine learning shows high performance on paper, but its large number of parameters makes it difficult to reconcile with asset pricing theory.
- Machine learning's high performance may be spurious and researchers may have missed something in their tests.
- ML is used in finance to process data that cannot be processed without it (text, images, etc.).
- ML is also used to process traditional data better than older models do.
Using ML to Make a Portfolio
1. Split the sample into training, validation, and test sets.
2. Train a set of models with different hyperparameters on the training sample.
3. Use the validation sample to measure performance and find the optimal hyperparameters.
4. Keep the forecast of the best model on the test sample.
5. Repeat steps 1-4 and concatenate the results to get a set of out-of-sample forecasts (signals).
6. Use the out-of-sample signals to build portfolio weights, for example with a long-short decile strategy or by using the signals directly as weights.
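The first five steps above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the course's implementation: the one-predictor ridge model, the penalty grid, the window lengths, the simulated data, and every function name are hypothetical choices made to keep the example short.

```python
import random

def fit_ridge_1d(xs, ys, penalty):
    """Ridge slope for a one-predictor model without intercept (assumed model)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

def mse(xs, ys, slope):
    """Mean squared forecast error of the forecast slope * x."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def out_of_sample_signals(xs, ys, penalties, n_train=60, n_val=20, n_test=20):
    """Roll a train/validation/test window through the sample and
    concatenate the test-sample forecasts of the best model."""
    signals = []
    start = 0
    while start + n_train + n_val + n_test <= len(xs):
        a = start + n_train          # end of training sample
        b = a + n_val                # end of validation sample
        c = b + n_test               # end of test sample
        # Train one model per hyperparameter value on the training sample.
        slopes = {p: fit_ridge_1d(xs[start:a], ys[start:a], p) for p in penalties}
        # Pick the hyperparameter with the lowest validation error.
        best = min(penalties, key=lambda p: mse(xs[a:b], ys[a:b], slopes[p]))
        # Keep the best model's forecasts on the test sample.
        signals.extend(slopes[best] * x for x in xs[b:c])
        # Move the window forward and repeat.
        start += n_test
    return signals

# Simulated data (assumption): a weak linear link between predictor and return.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [0.1 * x + random.gauss(0, 1) for x in xs]

signals = out_of_sample_signals(xs, ys, penalties=[0.0, 1.0, 10.0, 100.0])
print(len(signals), "out-of-sample forecasts")
```

The model is only ever evaluated on observations that were used neither for fitting nor for choosing the penalty, which is what makes the concatenated forecasts out-of-sample.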
Measuring Performance of ML Portfolio
- The basic performance metric is the annualized Sharpe ratio (mean return divided by the standard deviation of return).
- Check for high reward (mean return) and low risk (variance or std of return).
- Analyze performance with a table split by decile or long vs short, showing mean return, std of return, and Sharpe ratio.
- Ensure mean return tends to increase with decile, while std remains stable.
- Look for a high Sharpe ratio in the long-short portfolio.
- Check portfolio performance against benchmarks like the market portfolio and the Fama-French factors (HML_t and SMB_t).
- Regress returns against benchmarks to check for a significant and positive alpha, indicating performance not explained by known strategies.
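The checks listed above can be sketched in a few lines. This is a hedged illustration on simulated data: the function names, the equal weighting inside each decile, the monthly frequency, and the use of the market as the only benchmark are all assumptions. A fuller version would add HML and SMB as regressors, which needs a multiple regression.

```python
import math
import random
import statistics

def sharpe_ratio(returns, periods_per_year=12):
    """Annualized Sharpe ratio; returns are assumed to be excess returns."""
    return statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(periods_per_year)

def decile_returns(signals, returns, n_groups=10):
    """Equal-weighted return of each signal-sorted group for one period."""
    order = sorted(range(len(signals)), key=lambda i: signals[i])
    size = len(order) // n_groups
    return [statistics.mean(returns[i] for i in order[g * size:(g + 1) * size])
            for g in range(n_groups)]

def alpha_t_stat(portfolio, benchmark):
    """OLS of portfolio returns on one benchmark: (alpha, beta, t-stat of alpha)."""
    n = len(portfolio)
    mb, mp = statistics.mean(benchmark), statistics.mean(portfolio)
    sxx = sum((b - mb) ** 2 for b in benchmark)
    beta = sum((b - mb) * (p - mp) for b, p in zip(benchmark, portfolio)) / sxx
    alpha = mp - beta * mb
    ssr = sum((p - alpha - beta * b) ** 2 for b, p in zip(benchmark, portfolio))
    se_alpha = math.sqrt(ssr / (n - 2) * (1 / n + mb ** 2 / sxx))
    return alpha, beta, alpha / se_alpha

# Simulated data (assumption): 120 months, 100 stocks, returns load on the signal.
rng = random.Random(0)
market, by_decile = [], [[] for _ in range(10)]
for _ in range(120):
    mkt = rng.gauss(0.005, 0.04)
    signals = [rng.gauss(0, 1) for _ in range(100)]
    returns = [mkt + 0.01 * s + rng.gauss(0, 0.05) for s in signals]
    market.append(mkt)
    for d, r in enumerate(decile_returns(signals, returns)):
        by_decile[d].append(r)

# Long the top decile, short the bottom decile.
long_short = [hi - lo for hi, lo in zip(by_decile[9], by_decile[0])]

print("decile  mean    std     Sharpe")
for d, rets in enumerate(by_decile, start=1):
    print(f"{d:>6}  {statistics.mean(rets):.4f}  {statistics.stdev(rets):.4f}  {sharpe_ratio(rets):.2f}")
print(f"long-short Sharpe: {sharpe_ratio(long_short):.2f}")

alpha, beta, t_alpha = alpha_t_stat(long_short, market)
print(f"alpha={alpha:.4f}, beta={beta:.2f}, t(alpha)={t_alpha:.1f}")
```

Because the long and short legs both carry market exposure, the long-short return should have a beta near zero in this simulation, so most of its mean return shows up as alpha.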
Penalized Markowitz
- Penalized Markowitz extends Markowitz portfolio optimization by incorporating a penalty term similar to Ridge regression.
- The penalty discourages large absolute weights: the larger the penalty, the smaller the weights, which shrink toward zero for very large penalties.
- It performs better than standard Markowitz when the sample is small or the number of stocks is large.
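The shrinking effect of the penalty can be seen in a two-asset sketch. The notes do not give the exact objective, so this example assumes one: maximize w'mu - (1/2) w'Sigma w - (penalty/2) w'w, whose solution is w = (Sigma + penalty * I)^(-1) mu. The function name and the numbers are hypothetical.

```python
def penalized_markowitz_2(mu, cov, penalty):
    """Weights w = (cov + penalty * I)^(-1) mu for two assets (assumed objective)."""
    a, b = cov[0][0] + penalty, cov[0][1]
    c, d = cov[1][0], cov[1][1] + penalty
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix applied to mu.
    return [(d * mu[0] - b * mu[1]) / det, (a * mu[1] - c * mu[0]) / det]

# Two highly correlated stocks (correlation 0.95) with similar expected returns.
mu = [0.010, 0.012]
cov = [[0.0025, 0.002375],
       [0.002375, 0.0025]]

total_abs = []
for penalty in [0.0, 0.001, 0.01, 0.1]:
    w = penalized_markowitz_2(mu, cov, penalty)
    total_abs.append(abs(w[0]) + abs(w[1]))
    print(f"penalty={penalty:<6} weights=({w[0]:+.2f}, {w[1]:+.2f})")
```

With no penalty, the optimizer takes a large long-short bet on two nearly identical stocks. As the penalty increases, the weights become smaller in absolute value.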
Textual Analysis—The Old Ways (Before LLMs)
- Bag-of-words models count how often words associated with a given meaning appear in a text.
- TF-IDF converts text into vectors based on term frequency and inverse document frequency.
- Simple machine learning models like Word2Vec and BERT are used for text analysis, although LLMs outperform them in most financial tasks.
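A small sketch of the first two approaches, assuming whitespace tokenization and one common TF-IDF variant (term frequency times log(N / document frequency)). Other variants exist. The function names and the example sentences are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(doc):
    """Word counts for one document, ignoring word order."""
    return Counter(doc.lower().split())

def tf_idf(docs):
    """Return the vocabulary and one TF-IDF vector per document."""
    counts = [bag_of_words(d) for d in docs]
    vocab = sorted(set(w for c in counts for w in c))
    n_docs = len(docs)
    # Number of documents in which each word appears.
    doc_freq = {w: sum(1 for c in counts if w in c) for w in vocab}
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append([c[w] / total * math.log(n_docs / doc_freq[w]) for w in vocab])
    return vocab, vectors

docs = ["the profit rose", "the profit fell", "the revenue rose"]
vocab, vectors = tf_idf(docs)
print(vocab)
for doc, vec in zip(docs, vectors):
    print(doc, [round(v, 3) for v in vec])
```

A word that appears in every document ("the" here) gets a weight of zero, while a word specific to one document ("fell") gets the largest weight in that document.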
LLMs Demystification
Model Structure:
- LLMs process the input text by tokenizing it, adding positional encoding, and passing it through attention heads.
- The output of the attention heads is then fed into a classical neural network that predicts the next token.
Tokenization:
- Tokens are bits of text often representing words or parts of words.
- Each token is represented as a vector of numbers, using an encoding scheme designed to capture meaning and to make processing efficient.
Positional Encoding:
- LLMs use trigonometric functions to encode the position of each token. This is essential because the meaning of a text depends on the order of its tokens.
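As an illustration, here is a sketch of a trigonometric positional encoding. The notes do not spell out the formula, so this assumes the sinusoidal scheme of the original Transformer architecture (sine on even dimensions, cosine on odd dimensions, wavelengths controlled by the constant 10000). The function name is hypothetical.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal encoding of one token position as a vector of length d_model."""
    enc = []
    for i in range(0, d_model, 2):
        # Each pair of dimensions uses a different wavelength.
        angle = position / (10000 ** (i / d_model))
        enc.append(math.sin(angle))
        enc.append(math.cos(angle))
    return enc[:d_model]

for pos in range(3):
    print(pos, [round(v, 3) for v in positional_encoding(pos, 4)])
```

Each position gets a distinct vector with bounded values, which is added to the token representation so that the same token at two different positions looks different to the model.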
Attention Heads:
- Each attention head analyzes the importance of one token relative to another.
- Modern LLMs have multiple attention heads to enhance analysis and modeling of token relationships.
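A single attention head can be sketched as follows, assuming the standard scaled dot-product form: scores are dot products between query and key vectors, divided by the square root of the vector length and passed through a softmax. In a real model the queries, keys, and values are learned projections of the token vectors. Here they are passed in directly, and the function names and inputs are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to one."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    # Each output is a weighted average of the value vectors.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return weights, outputs

# Three toy token vectors used as queries, keys, and values at once.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, outputs = attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Row i of `weights` says how much token i attends to each token in the sequence. Running several heads with different projections and stacking their outputs gives the multi-head setup described above.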
Predicting Next Token:
- The processed latent representation, a complex tensor representing the model's view of the text, is input to a classical neural network that predicts the probability of the next token.
Training:
- All parameters of the LLM are trained jointly on a massive dataset to predict the next word.
- The training process involves hiding the next token and using the preceding context to predict it.
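To make the construction of training examples concrete, here is a minimal sketch consistent with the next-word objective stated above: each example pairs a context with the token that follows it. The function name and the whitespace "tokenizer" are hypothetical simplifications.

```python
def next_token_examples(tokens):
    """Build (context, hidden next token) pairs from one token sequence."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

tokens = "the market fell sharply".split()
for context, target in next_token_examples(tokens):
    print(context, "->", target)
```

A single text of n tokens therefore yields n - 1 training examples, which is one reason a large text corpus provides so much training signal.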
Inference:
- LLMs generate text autoregressively: they generate one token at a time and add it to the context before predicting the next one.
- The next token is drawn from the predicted probability distribution, often restricted to the top-K most likely tokens or to the top-p nucleus to filter out rare events.
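The two filters and the token-by-token loop can be sketched as follows. This is an illustration under assumptions: the probability table stands in for a real model's output, `toy_model` ignores its context, and all function names are hypothetical.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    keep = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[t] for t in keep)
    return {t: probs[t] / total for t in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose total probability reaches p."""
    kept, cumulative = [], 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept.append(token)
        cumulative += probs[token]
        if cumulative >= p:
            break
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}

def sample(probs, rng):
    """Draw one token according to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

def generate(next_token_probs, context, n_tokens, rng, k=2):
    """Autoregressive loop: sample a token, append it to the context, repeat."""
    context = list(context)
    for _ in range(n_tokens):
        probs = top_k_filter(next_token_probs(context), k)
        context.append(sample(probs, rng))
    return context

# Hypothetical model output: the same distribution whatever the context.
probs = {"rose": 0.5, "fell": 0.3, "was": 0.15, "banana": 0.05}
toy_model = lambda context: probs

print(top_k_filter(probs, 2))
print(top_p_filter(probs, 0.9))
print(generate(toy_model, ["profit"], 3, random.Random(0)))
```

Both filters remove the low-probability tail ("banana" here) before sampling, which limits the randomness of the generated text.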
Other Stuff to Prepare
- Understand all figures and tables discussed in class.
- Make sure you understand everything summarized above.
- Understand code in tutorials and exams. You will not need to write code but should be able to explain what you've seen before.
Description
Explore the intersection of asset pricing theory and machine learning in finance. This quiz discusses the challenges of reconciling traditional asset pricing factors with the complexity of machine learning models and examines how ML can enhance portfolio management. Test your knowledge on these key concepts and their applications in financial data processing.