EXCEL – FUNCTIONS AND FORMULAS
How to Enter Formulas in Excel: Quotes, Ampersands, and More
You always type the "=" character to begin entering a function or formula in Excel.
If you enter a simple formula such as =A1+B1, you can look at it and tell that it adds together the numbers in cells A1 and B1.
It becomes more complicated when you start using quotes (" " or ' '), ampersands (&), and other special characters in your formulas.
The ampersand (&) character joins together numbers, text, cell references, and the output of formulas.
You often use the & character with double quotes (" ") to join text with cells or functions. For example:
="First order was sold by Sales Rep "&D5
If D5 contains the sales rep's number, such as 1, this will display: "First order was sold by Sales Rep 1"
If we did not use the &, this formula would generate an error message because Excel would not be able to tell where the text ends and where the cell reference begins.
The double quotes tell Excel to interpret the first part as text rather than a formula, cell, or built-in function.
Another example is this formula:
="The total order amount is "&SUM(B5:B20)
Here, we are joining together text with the output of the SUM function, so the & is required.
If you enter just normal numbers as the inputs to a function, you do not need double quotes. For example:
=SUM(5, 4, 3, 2, 1)
Those are all normal numbers with no other operators or special characters, so the formula works.
But if you use something more than a simple number, you need to put double quotes around it:
=SUMIF(B5:B20,">200000",B5:B20)
If you do not use double quotes here, you'll also get an error message because Excel won't be able to interpret the >200000 part.
If you mix these types of operators (<, >, <>, <=, >=, etc.) with cell references, you also need to use an ampersand:
=SUMIF(B5:B20,">"&D2,B5:B20)
Here are a few potential function inputs to illustrate the idea behind the ampersand:
Potential Function Input: Why the & Is There or Not There:
Potential Function Input | Why the & Is There or Not There |
---|---|
"<5" | You don't need & because there are no cell references or functions. |
">"&D2 | You need & because there is a cell reference. |
">"&MAX(Summary!B8:B20) | You need & because there is a MAX function. |
=SUM(Orders_2!B3:B18)
=SUM('Orders 2'!B3:B18)
You need single quotes in the second one because the space counts as a "special character."
Single quotes make formulas harder to read and modify, so you should avoid spreadsheet names with special characters. Instead of using spaces to separate words in spreadsheet names, use the underscore character (_) since that does NOT count as a special character and does NOT require single quotes.
Your spreadsheet names should consist of letters, numbers, underscores, and nothing else.
Copying and Pasting Formulas
When you want to copy and paste formulas elsewhere, useful shortcuts include:
Shortcut | Command Name | Purpose |
---|---|---|
Ctrl + C (PC) or ⌘ + C (Mac) | Copy | Standard copy command. |
Ctrl + D | Copy Down | Copies a formula or constant down a range of cells, including all formatting and borders. |
Ctrl + R | Copy Right | Copies a formula or constant right across a range of cells, including all formatting and borders. |
Ctrl + ' | Copy from Above | Copies a formula or constant down from the cell directly above; no formatting or borders. |
Alt, E, S, T (PC) or Ctrl + ⌘ + V, T (Mac) | Paste Formats | Pastes just number formats, fills, font colors, and borders, but no formulas. |
Alt, E, S, V (PC) or Ctrl + ⌘ + V, V (Mac) | Paste Values | Pastes just the output of formulas – not the actual formulas – and ignores all formatting. |
Alt, E, S, F (PC) or Ctrl + ⌘ + V, F (Mac) | Paste Formulas | Pastes formulas, but not borders, fills, or font colors. |
Alt, E, S, R (PC) or Ctrl + ⌘ + V, R (Mac) | Paste Formulas and Number Formats | Pastes formulas and number formats but NOT borders, fills, or font colors. |
Alt, E, S, L (PC) or Ctrl + ⌘ + V, L (Mac) | Paste Links | Creates direct links to copied cells; useful for setting up headers. |
Financial Modeling Functions
"Financial modeling functions" are ones that involve a calculation (such as a sum or product), a logical check, (such as an IF statement with an AND), or a data lookup (such as finding a number in a range). Many of these functions are used in data analysis as well.
Logical and Arithmetic Functions
Key logical functions and operators include:
Function or Operator | Purpose |
---|---|
=IF(Condition, Value If True, Value If False) | Checks whether a condition is met and returns one value if it's TRUE and another value if it's FALSE. |
=IFS(Condition 1, Value If True, Condition 2, Value If True, …) | Checks separate conditions sequentially and returns the value for the first one that is TRUE. |
=OR(Condition 1, Condition 2, …) | Returns TRUE if any condition is true and FALSE if all conditions are false. |
=AND(Condition 1, Condition 2, …) | Returns TRUE if all the conditions are true. |
=CHOOSE(Index Number, Value 1, Value 2, Value 3, …) | Chooses a value from a list based on an index number. |
> | Returns TRUE if the first number or expression is greater than the second one. |
>= | Returns TRUE if the first number or expression is greater than or equal to the second one. |
= | Returns TRUE if the first number or expression equals the second one. |
<> | Returns TRUE if the first number or expression does not equal the second one. |
< | Returns TRUE if the first number or expression is less than the second one. |
<= | Returns TRUE if the first number or expression is less than or equal to the second one. |
Here are a few examples of how to use these functions and operators:
=IF(A1=5,"Valid", "Invalid")
=IF(OR(A1=5, B2=6), "Valid", "Invalid")
=IF(AND(A1>5,B2<=6), "Valid", "Invalid")
You often use these operators to error-check the output of financial models, as in this example:
=IF(OR(+G$60/G6<0,G$60/G6>=100),"NM",G$60/G6))
This formula divide cell G60 by G6, and if it's negative or greater than or equal to 100, it displays "NM" for "Not Meaningful"; otherwise, it displays the result of the calculation.
If you see a complex formula that uses IF, AND, OR, and the other operators above, break it into its components and put each one in a separate row to understand the entire formula.
Next, there are the arithmetic functions:
Arithmetic Function or Shortcut | Purpose |
---|---|
Alt + = (PC) or ⌘ + Shift + T (Mac) | Auto-sums adjacent cells. |
F9 | Refresh spreadsheet / re-calculate all formulas. |
=SUM(Number 1, Number 2, …) | Adds all numbers in a range of cells. |
=SUMIF(Range, Condition, Sum Range) | Adds all numbers in a range of cells that satisfy a single condition. |
=SUMIFS(Sum Range, Condition Range 1, Condition 1, …) | Adds all numbers in a range of cells that satisfy multiple conditions. |
=SUMPRODUCT(Array 1, Array 2, …) | Multiplies corresponding arrays or ranges and adds up their products. |
=COUNT(Value 1, Value 2, …) | Counts # cells in a range that contain numbers. |
=COUNTA(Value 1, Value 2, …) | Counts # cells in a range that are not blank. |
=COUNTIF(Range, Condition) | Counts # cells in a range that satisfy a single condition. |
=COUNTIFS(Condition Range 1, Condition 1, Condition Range 2, Condition 2, …) | Counts # cells in a range that satisfy multiple conditions. |
=AVERAGE(Number 1, Number 2, …) | Calculates the arithmetic mean of the inputs (cells, named cells, ranges, or arrays). |
=AVERAGEA(Number 1, Number 2, …) | Same, but excludes text and anything that is "False" in the inputs. |
=AVERAGEIF(Condition Range, Condition, Average Range) | Same, but checks for a certain condition to be satisfied before averaging the numbers in the range. |
=AVERAGEIFS(Average Range, Condition Range 1, Condition 1, …) | Same, but checks for multiple conditions to be satisfied before averaging the numbers in the range. |
=MEDIAN(Number 1, Number 2, …) | Returns the number in the middle when the numbers in a range are sorted. |
=MODE(Number 1, Number 2, …) | Returns the most frequent number in a set of numbers. |
=MAX(Number 1, Number 2, …) | Returns the largest value in a set of numbers, ignoring text and logical conditions. |
=MIN(Number 1, Number 2, …) | Returns the smallest value in a set of numbers, ignoring text and logical conditions. |
=QUARTILE(Array, Quartile # from 0 to 4) | Returns the MIN, 25th percentile, 50th percentile, 75th percentile, or MAX of a set of numbers. |
=ABS(Number) | Returns the absolute value of a number (i.e., it turns negatives into positives, and keeps positives the same). |
These functions come up in financial models and data analyses all the time – here are a few examples:
Financial Functions
There are hundreds of "financial functions" in Excel, but the most common ones in the models taught on this site are the following:
Financial Function | Purpose and Notes |
---|---|
=IRR(Series of Values) | Calculates the internal rate of return of a series of cash flows (roughly, the "effective compounded interest rate," with cash flows received between entry and exit subtracted from the running investment balance). Must have a negative for the first entry and then positives after that. Must have a hard-coded 0 if there are no cash flows in a specific period. |
=XIRR(Series of Values, Series of Dates) | Same as above, but this one works for non-annual dates with irregular periods in between. |
=NPV(Rate, Series of Values) | Calculates the Present Value of future cash flows (NOT the Net Present Value, despite the name!). Must have a hard-coded 0 if there are no cash flows in a specific period. The "Rate" is the Discount Rate, or the annualized return you could earn on other, similar investments. |
=XNPV(Rate, Series of Values, Series of Dates) | Same as above, but this one works for non-annual dates with irregular periods in between. |
=MIRR(Series of Values, Finance Rate, Reinvestment Rate) | Calculates IRR, but lets you assume re-investment of positive cash flows at a different rate, such as 5% or 10%, and lets you discount negative cash flows to Present Value at a different rate. This function better reflects reality because you may not be able to re-invest the positive cash flows at the same rate as the overall IRR. |
=PPMT(Interest Rate, Period #, Total # of Periods, Loan Principal) | Calculates principal payment on debt based on a constant interest rate, maturity, debt amount, and amortization period. Function ensures that (Interest + Principal Payment) is constant in each period. |
=IPMT(Interest Rate, Period #, Total # of Periods, Loan Principal) | Same, but for interest payments instead. |
=FV(Rate, Total # of Periods, Initial Amount) | Grows an initial amount by a certain rate over a # of periods and sums up each amount over that time (e.g., if $1,000 grows by 3% per year, $1,000 + $1,030 + $1,061 +…). |
=YIELD(Settlement Date, Maturity Date, Coupon Rate, Bond Purchase Price Out of 100, Amount Received Back at End Out of 100, 1 or 2 or 4 for Annual or Semiannual or Quarterly Interest) | Calculates the "yield" on a security (bond) that pays periodic, fixed interest. The "yield" is the annualized return if you purchase a security at a certain price, earn cash flows from it, and eventually receive back some or all of the principal upon maturity. |
=PRICE(Settlement Date, Maturity Date, Coupon Rate, Prevailing Yields on Similar Bonds, Amount Received Back at End Out of 100, 1 or 2 or 4 for Annual or Semiannual or Quarterly Interest) | Calculates the "price" of a security (bond) that pays periodic, fixed interest. The "price" is the Present Value, or what a security should be worth if it has a certain cash-flow profile and promises to pay back some or all of your principal upon maturity. |
=DURATION(Settlement Date, Maturity Date, Coupon Rate, Prevailing Yields on Similar Bonds, 1 or 2 or 4 for Annual or Semiannual or Quarterly) | Calculates the "duration" of a security (bond) that pays periodic, fixed interest; "duration" is approximately how long it takes to earn back your initial investment. |
You often use these functions when making investment decisions.
For example, is the annualized return on one company's stock above your target? How does it compare with those of similar companies in the sector?
Or, for properties, what is the annualized return you could earn on an office or apartment you rent out to others, and how does it compare with the potential returns for similar properties in the area?
Here's an example of the IRR function:
The IRR is 8.1%, which is above our target of 7.0% for similar properties, so it makes sense to invest.
IRR assumes regular, annual dates; if you want to use non-annual or irregular dates, use XIRR:
With functions such as IRR, XIRR, and MIRR, you compare the annualized rate of return to your targeted return and decide based on that.
With functions such as NPV and XNPV, you compare the present value of an investment to its “asking price” or purchase price. If the present value is higher, invest; if not, do not invest.
Here’s an example of Present Value, calculated with the NPV function (which, confusingly, calculates Present Value – not Net Present Value):
The IPMT, PPMT, and FV functions are common in real estate. Real estate loans, such as mortgages, are issued to fund property purchases, and they must be repaid over time.
Principal repayments represent repayment of the initial loan balance, while interest payments are additional costs to compensate the lender for taking a risk by issuing the loan. Here’s an example:
These functions only work if the interest rate and the number of periods are constant – if they’re not, then you will need to calculate the interest and principal repayments manually.
The FV function is often used to calculate leasing commissions, which are paid to brokers who find new tenants or get existing ones to renew.
For example, if a tenant moves in and pays $10,000 in rent per year initially, the rent increases at 3% per year, the lease term is 5 years, and the commission percentage is 4%, then the commissions are:
Finally, the YIELD, PRICE, and DURATION functions are typically used with bonds.
You purchase a bond from a company or government, thereby acting as the lender, and then the company or government pays a fixed interest rate to you and promises to repay the bond in full when it “matures” in the future.
Bond math is more complicated than it sounds because the bond’s market price may be different than its “face value” or “par value.” For example, a company could issue a bond at $100, but its price might rise to $105 in the secondary market.
Also, there is no guarantee that the company or government will repay the bond in full, and if yields and rates on similar bonds change, this bond will be worth a different amount.
The Yield to Maturity (YTM) calculates the IRR if you buy a bond at a certain price, hold it to maturity, and receive interest based on a fixed coupon rate until then. You must also assume a repayment percentage and prevailing yields on similar bonds in the market:
The YTM of 6.32% is less than the coupon rate of 7.00% because the bond is 5% more expensive than the price of $100 it was initially issued at (the “par value”).
A bond’s price is similar to the concept of Present Value: given a set of cash flows, a maturity date, a repayment percentage, and yields on similar bonds (the Discount Rate), what is this bond worth?
It accepts almost the same inputs as the YIELD function:
This bond’s price is $111 because it offers a much higher coupon rate than other, similar bonds in the market. So, even though its market price is $105 rather than $100, it’s still a good deal because its price “should be” $111.ss than the coupon rate of 7.00% because the bond is 5% more expensive than the price of $100 it was initially issued at (the “par value”).
Finally, Duration measures approximately how long it takes to earn back your initial investment when you purchase a bond.
Investors use it to manage their portfolios and weight them appropriately between short-term and long-term bonds. The set of inputs is very similar:
The Duration here is 7.5 years, which means that even though the bond matures in 10 years, we earn back our full investment in only 7.5 years – assuming the interest payments all arrive on time.
However, Duration also ignores the current market price of the bond, so it will not produce accurate results if that price is much higher or lower than the bond’s par value.
Lookup Functions and INDEX/MATCH
Lookup functions let you find specific values in ranges or tables of data.
The main lookup functions – HLOOKUP and VLOOKUP – have performance and power limitations, but you need to know them because many people do not understand the INDEX and MATCH functions. VLOOKUP will find a value in the leftmost column of a range, go down to that row number, and return the value at the intersection of that row number and a column number you specify.
HLOOKUP will find a value in the top row of a range, go across to that column number, and return the value at the intersection of that column number and a row number you specify.
The inputs look like this:
=VLOOKUP(Value You Want to Find in Leftmost Column, Range of Cells Where You're Looking, Column # of Matching Entry You Want to Get, Approximate or Exact Match)
=HLOOKUP(Value You Want to Find in Top Row, Range of Cells Where You're Looking, Row # of Matching Entry You Want to Get, Approximate or Exact Match)
You can also use the ROWS and COLUMNS functions with a table or range of cells to get the number of rows or columns, which makes the lookup functions more flexible if you're looking for data in the bottom row or rightmost column.
Here’s how you can use VLOOKUP with the customer order data used in this course:
You can also use the * and ? characters to search for items that start or end with certain letters, or items that have a certain number of characters.
In the examples below, the Order_Range named range represents the entire area in which we’re searching, and 8 is always the column number for “Amount”:
Example Formula | What It Does |
---|---|
=VLOOKUP("Natacha Tate",Order_Range,8,FALSE) | Returns order amount placed by first instance of Natacha Tate. |
=VLOOKUP("Ka*",Order_Range,8,FALSE) | Returns order amount placed by first instance of a customer whose name starts with "Ka". |
=VLOOKUP("????????",Order_Range,8,FALSE) | Returns order amount placed by first instance of a customer whose name has 8 letters. |
=VLOOKUP("*son",Order_Range,8,FALSE) | Returns order amount placed by first instance of a customer whose name ends with "son". |
=VLOOKUP("Ha*fe",Order_Range,8,FALSE) | Returns order amount placed by first instance of a customer whose name starts with "Ha" and ends with "fe". |
These functions may seem useful, but they have some big limitations as well.
First, you can only go from left to right or from top to bottom, and you need to know the exact column or row number of the data you’re looking for.”:
So, if the size of the range might change in the future due to additional rows or columns, these fuctions are risky to use.
But the main issue is that HLOOKUP and VLOOKUP are volatile functions, which means that Excel recalculates them and any cells that depend on them whenever anything in the file changes.
INDEX and MATCH are superior because they’re more flexible and also non-volatile, which means greater efficiency in large files.
The INDEX function returns a value at the intersection of a row number and column number in a range. The syntax is:
=INDEX(Range of Cells, Row Number, Column Number)
The column number is optional; if you don’t enter anything, Excel will assume “1,” which makes it work only for 1-column ranges.
INDEX by itself is not that useful because you still need to know the row and column numbers to use it.
You usually combine it with the MATCH function, which searches for data in a specific row or column and then returns its relative position (e.g., row #2 in a single column with 8 rows in it). The syntax is:
=MATCH(Lookup Value, Lookup Range, Match Type)
You pass in a single row or column for the “Lookup Range” part – normally the top row and leftmost column of the range you want to search. The Match Type is normally 0 or FALSE to indicate that you want an Exact Match, but you can also use 1 for “Less Than” and -1 for “Greater Than.”
When you use these functions together, you must make sure that the "sizes" of the Lookup Ranges in the MATCH functions are the same as the size of the Range in the INDEX function.
For example, if the Range in the INDEX function is 20 rows x 30 columns, then any column that you pass into a MATCH function that’s used in this INDEX function must also have 20 rows, and any row must have 30 columns.
Here’s how you can use INDEX and MATCH to replace the VLOOKUP function shown above:
The advantage is that if we ever add or delete columns or rows in this range, this formula will still work because it searches for the position of the “Amount” column and uses that.
The INDEX and MATCH combination has obvious uses in data analysis, but you also use it in financial models. Here are a few examples from the Walmart model:
Use Case #1: Scenarios
Use Case #2: Retrieving Data for Comparable Public Companies
If you go to the Public_Comps_Data sheet, you can see the Pub_Comps_Range area, the Pub_Comps_Params column, and the Pub_Comps_Tickers row:
This INDEX/MATCH formula searches for a financial stat, such as “Effective Tax Rate,” in column B, and determines its row number there.
Then, it searches for the company’s “ticker,” such as COST for Costco or WBA for Walgreens, in the top row, and finds its column number there.
In this example, the row number is 14 for Effective Tax Rate, and the column number is 6 for COST.
INDEX then retrieves the value at the intersection of column 6 and row 14, which is 20.7%.
INDIRECT
The INDIRECT function lets you create cell references out of text. For example, instead of writing a formula like this:
=AVB_Data!F3
You could use INDIRECT to write:
=INDIRECT("AVB_Data!F3")
INDIRECT is most useful when you have many sheets of data with similar but slightly different names, and you want to write formulas that retrieve and aggregate data from all of them.
For example, in an oil & gas model, you might use it to retrieve production data from all the regions that a company operates in. In a real estate model, you might use it to retrieve rental data from different properties or regions.
You can also use INDIRECT with the SHEET function to find the “position” of a spreadsheet in a file:
The disadvantage is that INDIRECT is a volatile function, just like HLOOKUP and VLOOKUP, so Excel will always recalculate it, along with any cells that depend on it, when anything in the file changes.
However, sometimes it’s worth using anyway because it lets you do things that might normally require VBA or macros (which can also create performance issues).
One example in the course is for a file with rental rates, occupancy rates, and revenue for several REITs that operate in different regions of the U.S.
In that lesson, we want to line up each REIT’s performance in each city or region to make a side-by-side comparison using one single formula in Excel. Here are the sheets in the file:
Each name has the REIT’s ticker followed by “_Data”, so this is an ideal use case for INDIRECT: each name is similar but slightly different.
We use this formula for the data retrieval:
The INDIRECT($C11&$D$2 part creates the “AVB_Data” and “EQR_Data” and “AIV_Data” (and so on) names of the spreadsheets.
B2:M100 is the entire range we’re indexing, B2:B100 is the column we want to search, and B2:M2 is the row we want to search. Each one of these is in the other spreadsheets, such as AVB_Data and EQR_Data.
Then, the formula tries to find a match for “Boston, MA” in the B2:B100 column of this other spreadsheet and for “Q2 19 Rental Increase %” in the B2:M2 row of this other spreadsheet.
If it finds matches for both, it returns the value at the intersection of those row and column numbers. If it does not, the IFERROR function around it displays “” for a blank cell.
Array Functions
Entering a formula as an “array function” changes it so that the formula operates on multiple values or cells instead of just a single value or cell.
Some built-in functions, such as SUMPRODUCT, SUMIF, and SUMIFs, already accept arrays or ranges as input.
Other functions, such as IF and IFERROR, do not. Therefore, if you want them to act as array functions instead, you must enter them by pressing Ctrl + Shift + Enter rather than the normal Enter key.
Other built-in functions, such as TRANSPOSE, must be entered as array functions to work properly (i.e., if you do not use Ctrl + Shift + Enter, they won’t work at all).
TRANSPOSE, which turns rows into columns and columns into rows, is the most common use case for array functions. Here’s what it looks like:
You can also use array functions to write formulas for MINIF and MAXIF (i.e., to find the minimum or maximum number in a range that satisfies a certain condition).
Here’s our version of MINIF:
You can’t see it here because of how Excel displays formulas in Edit Mode, but it has the { and } since we entered it as an array function with Ctrl + Shift + Enter. The formula text is:
=MIN(IF((Orders!$K$3:$K$1001>=Summary!B7)*(Orders!$K$3:$K$1001<Summary!B8)>0,Orders!$J$3:$J$1001,""))
The Summary!B7 and Summary!B8 cells contain the start date for the previous year (January 1, 2016) and the start date for this current year (January 1, 2017).
The first IF statement checks if the order date is in between those dates. If it is, then the multiplication operator produces 1 * 1, or TRUE. If it is not, then it produces 0 * 1, 1 * 0, or 0 * 0, all of which are 0 or FALSE.
Then, if it is TRUE, the formula gives us the corresponding Order Dollar Amount in column J of the Orders spreadsheet. We end up with an entire collection of Order Dollar Amounts that meet the date criteria, and then we apply the MIN function to that collection.
If the date condition is FALSE, we use “”. We cannot use 0 because 0 will always be the MIN of this set of numbers! The double quotes “trick” Excel into ignoring orders that do not meet the date condition.
We must enter this as an array function because IF does not normally work on a range of cells.
We can’t write this function using AND instead of the multiplication operator because Excel will evaluate the condition against the entire range of dates rather than going row-by-row.
Formatting Functions
The built-in formatting functions are fairly straightforward: you use them to modify text, dates, and times, and to convert long strings of text into separate columns for use in data analysis.
Text and Formatting Functions and Text to Columns
These functions are moderately useful, but more for data analysis than financial models:
Function or Operator | What It Does |
---|---|
& | Joins together text with cells, functions, and numbers. |
=SUBSTITUTE(Within Text, Text to Replace, Text to Swap In) | Replaces text based on a search for "Text to Replace". |
=TEXTJOIN(Delimiter, Ignore Empty Cells, Text 1, Text 2, …) | Joins together pieces of text with a character (the "delimiter") in between. |
=TEXT(Value, Format Text) | Displays a cell's contents in a different format. |
Alt, M, T (PC Only) | Accesses all built-in text functions. |
Text to Columns: Alt, A, E (PC Only) | Separates text into separate columns based on a delimiter or fixed widths. |
=LEFT(Text, # Characters) | Gives characters from start of text until... |
=RIGHT(Text, # Characters) | Gives characters from end of text until... |
=MID(Text, Start #, # Characters) | Gives characters from position # in text until... |
=TRIM(Text) | Removes extra spaces. |
=PROPER(Text) | Makes first letter in each word uppercase. |
=CLEAN(Text) | Removes all non-printable characters from text. |
=UPPER(Text) | Makes text all uppercase. |
=LOWER(Text) | Makes text all lowercase. |
=FIND(Text to Find, Within Text) | Searches for text within text; case sensitive. |
=SEARCH(Text to Find, Within Text) | Searches for text within text; not case sensitive. |
=LEN(Text) | Gives # of characters, including blank spaces, in text. |
=REPLACE(Within Text, Start #, # Characters, Text to Swap In) | Replaces text within text based on the starting position and # of characters to replace. |
Here are examples of how to use these functions (and some additional notes):
The Text to Columns function (Alt, A, E on the PC; no shortcut on the Mac, so go to the Data tab in the ribbon menu to find it) is also useful for separating text into separate columns:
The same character must separate each field, so you can't use a mix of commas and periods; also, you'll still have to use functions like PROPER and TRIM to format the data after this split.
Date and Time Functions
The most important point about Date and Time Functions is that they depend on your Windows or Mac OS system settings!
If dates or times are not recognized when you enter them in Excel, go to Control Panel → Clocks and Regions or Clock, Language, and Region → Region and change the Short Date and Long Date formats to the settings you want (in Windows).
In Mac OS, go to the Apple symbol in the top left of the screen → System Preferences → Language & Region → and select your region there or click “Advanced” to customize the formats; use the Dates and Times panes there to change these settings.
Once you've made sure your date and time inputs are correct, the most useful functions are:
Function | What It Does |
---|---|
=DATE(Year, Month, Day) | Creates date in Excel. |
=EDATE(Start Date, # Months) | Moves forward from Start Date by # Months; # Months can also be negative. |
=EOMONTH(Start Date, # Months) | Returns last day of month after # Months have passed from Start Date; # Months can also be negative. |
=DATEVALUE(Text) | Converts text in different formats to real date; must use double quotes to indicate text input. |
=YEAR(Date) | Returns year of date. |
=MONTH(Date) | Returns month of date. |
=DAY(Date) | Returns day of date. |
=YEARFRAC(Start Date, End Date) | Returns fraction of the year between two dates (e.g., ~0.50 for June 30th and December 31st). |
The DATE function is useful for creating unambiguous dates in Excel to avoid problems with different date formats.
You often use EDATE and EOMONTH in financial models to create annual, quarterly, and monthly headers at the tops of schedules:
Functions for Scenarios, Auditing, and Data Validation
This entire section deals with functions that are “nice to have,” but not critical in the same way that the financial functions and INDEX and MATCH are.
Data Validation
Data Validation (Alt, A, V, V or Alt, D, L on the PC; no shortcut on the Mac, so go to the Data tab in the ribbon menu) lets you restrict the inputs of certain cells.
For example, you can use it to ensure that the user can enter only 0 or 1 in a certain cell:
You can also use it to allow users to select from a range of values in a drop-down menu, as we do with the scenarios in the Walmart model:
When “Allow” is set to “List,” checking the “In-cell dropdown” box will create the drop-down menu shown above.
Data Validation offers many options: Decimals, Lists, Dates, Times, Text Length, and Custom.
But you should not go crazy with it – it’s pointless to validate every single input cell in the model. Use it for the most important assumptions, and in input boxes where you want to create drop-down menus.
Also, in some versions of Excel, you may not be able to navigate to “Source” cells in other spreadsheets when you select “List,” so you’ll have to type out references to other sheets manually.
We usually use Data Validation to set up the scenario selection drop-down menu, and then INDEX and MATCH, as shown above, to set the numbers based on the selected scenario.
CHOOSE and OFFSET
If you want to use less efficient and less flexible functions for scenarios, CHOOSE and OFFSET are good choices.
CHOOSE selects a value or action from a list based on an index number you pass in, such as:
=CHOOSE(Index Number, Value 1, Value 2, Value 3, …)
=CHOOSE(Scenario_Num, 1, 2, 3)
It’s not a great choice for scenarios because you cannot pass in an entire range of cells for the values – you must hard-code them or link to individual cells separated by commas.
OFFSET starts in a specific cell and then moves down a certain number of rows and over a certain number of columns; it can also return a range of cells rather than an individual cell if you specify the height and width:
=OFFSET(Reference Cell, Rows, Columns, Height, Width)
=OFFSET(A1, 5, 5, 3, 3)
This function above will start in cell A1, move down 5 rows, move to the right 5 columns, and then retrieve a 3x3 range of cells starting in that top-left corner.
OFFSET is useful because of this ability to return a range of cells, which lets you write formulas such as:
=SUM(OFFSET(I33, 0, 0, 4, 5))
This one will sum up the 4-row x 5-column area starting in cell I33.
OFFSET can be helpful in schedules that show a company’s spending over time and its allocation in each year, such as the one for CapEx and Depreciation below:
If we break down this formula:
=IF(R$49 >= $P53, OFFSET($Q$51, 0, R$50 – ($P53 – $P$53)) * $Q53, 0)
The first part after the IF checks to see that the current year in row 49 is the same as or after the one on the left-hand side in column P, meaning that there may be some Depreciation.
Then, the OFFSET function starts in the top-left cell ("% Allocation"), stays in the same row, and then moves over the appropriate number of columns for the year we're in.
In row 1 and column 3, we're in Year #3 for the Year #1 spending. This formula becomes T$50 – ($P53 – $P$53), or 3 – 0, so we move over 3 columns.
In row 3 and column 4, we're in Year #4 for the Year #3 spending. This formula becomes: U$50 – ($P55 – $P$53) or 4 – (2021 – 2019) = 4 – 2 = 2, so we move over 2 columns.
In other words, it finds the relative position for the Depreciation percentage based on the year of the spending and the year we're in.
Besides these schedules, it’s best to avoid OFFSET because:
1) It’s a volatile function, similar to HLOOKUP and VLOOKUP, so any changes elsewhere in the file will cause formulas using OFFSET (and their dependents) to recalculate.
2) It’s difficult to trace and audit formulas when OFFSET returns a range of cells rather than a single cell.
Circular References
A circular reference means that the input of a cell depends on the output of that same cell.
For a very simple example, let’s say that we set cell Q14 equal to R14 + Q14:
Since Q14 adds itself to determine the output, it’s a circular calculation, and Excel will generate an error unless you’ve enabled iterations (Options → Formulas or Calculation → Enable iterative calculations).
A circular reference could be direct, as it is here, or it could be indirect.
For example, in many financial models, the company’s Interest Income depends on its Cash balance: a higher Cash balance means more Interest Income.
But the Cash balance also depends on the company’s Interest Income – higher Interest Income means a higher Cash balance!
Depending on the model’s setup, that relationship could create a circular reference. You’ll know it when you see text at the bottom of the Excel window that reads “Calculate”:
Circular references come up mostly with Interest Income and Interest Expense in financial models, and in calculations for a company’s implied share price in a DCF.
Some people argue that you should never include circular references in models since they can cause ascading errors and make it very difficult to modify formulas and find errors.
But you will encounter models with circular references, so you must know how to work with them.
The key point is that you must always include a way to DISABLE circular references in your models so that you can edit formulas more easily.
Using the average balances for the Interest on Cash and Debt in models creates circular references because the Ending Balances depend on Interest during the period – but the Interest during the period depends on the Beginning balance and the Ending Balance.
So, you can build in an on/off switch by adding an input cell for “Average Balances” and then checking for the condition in the interest calculations:
Selecting and Tracing Precedents and Dependents
If you ever run into a problem with a formula, it’s helpful to trace precedents (inputs or cells that flow into that formula) and trace dependents (outputs or cells that are affected by that formula).
Here are the key commands and shortcuts:
If you’re checking for consistency, it is far more efficient to press Ctrl + [ and Ctrl + ] than to go into each cell individually and press F2 or Ctrl + U.
But the simplest commands here are often the most useful ones: it’s not that helpful to see 157 arrows on the screen pointing in different directions, or to see 38 different cells selected.
Here’s an example of the Ctrl + [ shortcut for Select Direct Precedents used in cell I68:
By contrast, here’s the Shift + Ctrl + { shortcut for Select All Precedents in the same cell:
And then here’s the Trace Precedents command (Alt, M, P in PC/Windows Excel):
Error Checks and Comments
When you enter formulas in Excel, you could run into a variety of errors.
For example, maybe you divided a number by 0 (undefined), maybe you entered a function name that doesn’t exist, or maybe you used the wrong inputs for a function.
Here’s a summary of the different types of errors:
You could check for each error separately using functions such as ISNA and ISNUMBER, but the easiest solution is to use the IFERROR function, which checks for all these errors:
=IFERROR(Value or Calculation or Formula, Output If It Produces an Error)
If there is no error, IFERROR will display the output as is; if there is an error, it will display a message that you specify. A few examples from the Walmart model include:
Comments or Notes allow you to explain where a certain input, formula, or calculation came from, or why you’ve set up the model a certain way.
You can press Shift + F2 in both PC and Mac Excel to leave these comments, and you can delete comments by pressing Esc and then Del.
This feature is less useful than it seems; in practice, only one person (usually an Analyst or Associate) “owns” a financial model in investment banking.
If senior bankers want to make edits or changes, they’ll often print out the model or presentation and leave their notes on the printed pages.
Sensitivity Tables
Sensitivity tables let you vary the assumptions in a model and look at the output under a range of different outcomes.
All investing is probabilistic because you can’t possibly know what will happen 5, 10, or 15 years into the future – but you can come up with a reasonable set of potential scenarios.
Therefore, sensitivity tables are critical in making investment decisions and advising clients.
Internally in Excel, sensitivity tables are known as “data tables,” and you can access them in the ribbon menu under the Data tab → “What-If Analysis.” In PC/Windows Excel, the shortcut is Alt, D, T or Alt, A, W, T.
A properly set-up and formatted sensitivity table looks like this (taken from the Walmart DCF, where we vary the Discount Rate and Terminal Growth Rate to assess the company’s implied value):
To create this table, you must clear the white area shown above, link to the formula you want to sensitize in the top-left cell (the blank, light-blue one), select the whole area, including the white and light-blue cells, and then enter the row and column inputs:
There are a few important setup rules with sensitivity tables as well:
1)The input variables and output must be on the same spreadsheet as the table. You cannot use assumptions or drivers from other sheets, such as the 3-statement model, in this table.
2)The numbers in the input row and column cannot be linked to or from anything that’s in the model. Start each input row or column with a hard-coded number and then hard-code the rest or make them change by simple percentages or numbers.
3)The row and column inputs and the output must be related in some way. If the inputs do not affect the output, the table will show no changes as you vary the numbers.
4)Set “Workbook Calculation” in Options or Preferences to “Automatic except for data tables” or your spreadsheet will slow down, especially with many tables. You can then press F9 to refresh or update the tables.
5)Enter a direct link to the output you want to sensitize in the top-left-hand corner of the table. And then, select everything and go through the steps shown above. “Row Input Cell” should be a direct link for the input in the top row, and “Column Input Cell” should be a direct link for the input in the left column.
6)You cannot modify individual cells in the table once it has been created. If you want to change something or select different inputs or outputs, you must delete and re-enter the entire table.
You will see sensitivity tables in almost every financial model of intermediate complexity and beyond,so you must know how to use them.
They aren’t that difficult, but many students make mistakes with the points above.
Goal Seek and Solver
While sensitivity tables are useful for assessing valuations and potential returns and making investment decisions, sometimes, you want a single “answer.”
For example, what’s the most that you could pay for a company and still generate a 20% IRR if you sell it in 5 years?
Or, what’s the maximum a company is worth today if it grows by 10% per year and is worth $500 million in 10 years?
To answer these questions, you can use the Goal Seek feature (Data → What-If Analysis → Goal Seek, or Alt, A, W, G on the PC) and the Solver Add-In (Alt, A, Y3 on the PC).
The Solver Add-In is external, but it comes with Excel and can be enabled quite easily (Options or Preferences → Add-Ins → Manage Excel Add-Ins).
To use either of these, your model cannot contain circular references.
If your model does contain circular references, remove them by changing the formulas to use the Beginning Cash and Debt (for interest), and deleting circular share-price calculations.
You use Goal Seek to answer simple, one-variable questions, such as “What would the purchase price have to be for the IRR to be 15%?” Here it is in Excel:
This one produces $155 as the answer, meaning that the asking price would have to drop to $155 for the IRR to increase to 15%.
Solver allows you to add multiple constraints so you can answer questions such as, “If the maximum cash flow growth rate is 5%, and the maximum apartment sale value is $230, how much would we have to pay upfront to earn a 10% IRR?”
Here are the options Solver gives you to answer this question:
It produces an answer of $188 here:
With this $188, the growth rate is still just 3.0%, and the sale value is $225. If you want to use the maximum values for both of those, you need to change the constraints and use = rather than <=.