Heredoc
A heredoc (here document) is a way to create a multi-line string in Ruby. It's typically used for creating multi-line messages, SQL queries, or HTML.
A heredoc is defined with an opening symbol of either <<, <<-, or <<~ followed by an identifier (typically all uppercase) that wraps either side of the document.
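As a quick preview of how the three forms differ, here is a small sketch that runs the same body through each of them. The method names (plain_form, dash_form, squiggly_form) are made up for this illustration.

```ruby
# Hypothetical helpers: one per heredoc form, each with the same indented body.

def plain_form
  # <<  : body kept verbatim, terminator must sit at column 0
  <<TXT
    one
      two
TXT
end

def dash_form
  # <<- : body kept verbatim, terminator may be indented
  <<-TXT
    one
      two
  TXT
end

def squiggly_form
  # <<~ : common leading indentation is removed, terminator may be indented
  <<~TXT
    one
      two
  TXT
end

p plain_form     # => "    one\n      two\n"
p dash_form      # => "    one\n      two\n"
p squiggly_form  # => "one\n  two\n"
```

Only the squiggly form changes the body: it drops the four spaces shared by both lines and keeps the extra two in front of "two".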
The base form of a heredoc requires the closing identifier to have no indentation. For that reason, it is not used as much as the other forms.
" msg1 = <<TXT
" Some text
" More text
> TXT
=> " Some text\nMore text\n"
> puts msg1
Some text
More text
=> nil
" msg2 = <<TXT
" Some text
" More Text
" TXT
" ope, that needs to not be indented
> TXT
=> " Some text\n More Text\n TXT\n ope, that needs to not be indented\n"
We can see how weird << gets when we have an extra layer of indentation, like we often would in our Ruby programs.
* msg2 = begin
"   <<TXT
"     one
"     two
"     TXT
"   TXT
* TXT
> end
=> "    one\n    two\n    TXT\n  TXT\n"
If we want to preserve the indentation in our heredoc text while placing the terminating identifier at whatever column (indentation level) we feel is appropriate, then we want the <<- form:
" msg1 = <<-OTHER
"   one
"   two
> OTHER
=> "  one\n  two\n"
* msg2 = begin
"   <<-OTHER
"     one
"     two
*   OTHER
> end
=> "    one\n    two\n"
If the leading indentation is good for the readability of our program, but not useful in the resulting string, then we should reach for the squiggly heredoc (<<~). It removes the indentation of the least-indented line from every line, so any deeper, relative indentation is kept:
" msg1 = <<~RUBY
"   three
"   four
"   five
> RUBY
=> "three\nfour\nfive\n"
" msg2 = <<~RUBY
"   three
"     four
"       five
> RUBY
=> "three\n  four\n    five\n"
Heredocs without interpolation
Wrap the initial identifier in single-quotes and you get a heredoc that won't perform string interpolation.
" <<-NUMBERS
" 1
" #{1 + 1}
" 3
> NUMBERS
=> " 1\n 2\n 3\n"
' <<-'NUMBERS'
' 1
' #{1 + 1}
' 3
> NUMBERS
=> " 1\n \#{1 + 1}\n 3\n"
A small detail to notice about how IRB handles these two heredocs is that the first is decorated with leading " characters, since it is a heredoc that supports interpolation. The second, a single-quoted heredoc, is decorated with leading ' characters.
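Outside of IRB the same difference can be sketched with two small methods. The names interpolated_greeting and raw_greeting are invented for this example.

```ruby
# Bare identifier: behaves like a double-quoted string, so #{...} is evaluated.
def interpolated_greeting(name)
  <<~TEXT
    Hello, #{name}!
  TEXT
end

# Single-quoted identifier: behaves like a single-quoted string,
# so #{name} is kept as literal text and never evaluated.
def raw_greeting
  <<~'TEXT'
    Hello, #{name}!
  TEXT
end

puts interpolated_greeting("Ruby")
puts raw_greeting
```

Note that raw_greeting has no name variable in scope at all, and that is fine because the body is never interpolated.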
Interacting with a Heredoc
A heredoc can have methods chained onto it. A heredoc can be used directly as an argument to a method. And we can do both of those things at once. It looks a bit funky at first, so let's build up a few examples.
To start, let's say we are building up a SQL query in a heredoc and we want to remove leading and trailing whitespace. We can use the String#strip method.
# as is
query = <<-SQL
  select *
  from books
  where -- ...
SQL
#=> "  select *\n  from books\n  where -- ...\n"
# with whitespace removed
query = <<-SQL.strip
  select *
  from books
  where -- ...
SQL
#=> "select *\n  from books\n  where -- ..."
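Notice the #strip method call is chained onto the opening identifier of the heredoc, not onto the closing one. The opening identifier stands in for the whole string within that line of code. It might look a bit odd, but that is how it works.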
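The chain is not limited to a single call. As a rough sketch, here is a made-up normalized_query method that collapses a multi-line query into one line by chaining two String methods onto the opening identifier.

```ruby
# Hypothetical helper: everything after <<~SQL on the opening line is an
# ordinary method chain applied to the finished heredoc string.
def normalized_query
  <<~SQL.split.join(" ")
    select *
    from books
    where published = true
  SQL
end

puts normalized_query
# => select * from books where published = true
```

String#split with no arguments splits on any run of whitespace, newlines included, which is what flattens the query.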
Similarly, if we want to pass a heredoc directly as an argument to a method, only the leading identifier goes in the method call and the rest comes on the lines directly after.
# without parentheses
ActiveRecord::Base.connection.execute <<-SQL
  select *
  from books
  where -- ...
SQL
# with parentheses
ActiveRecord::Base.connection.execute(<<-SQL)
  select *
  from books
  where -- ...
SQL
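To try this without a database, here is a minimal sketch with a stand-in method. Both fake_execute and the EXECUTED_SQL log are assumptions made up for the example, not part of any library.

```ruby
# Hypothetical stand-in for a database call: it only records the SQL it gets.
EXECUTED_SQL = []

def fake_execute(sql)
  EXECUTED_SQL << sql
  sql
end

# without parentheses
fake_execute <<~SQL
  select *
  from books
SQL

# with parentheses: the closing paren stays on the opening line
fake_execute(<<~SQL)
  select *
  from authors
SQL

p EXECUTED_SQL
# => ["select *\nfrom books\n", "select *\nfrom authors\n"]
```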
Now let's combine both concepts and see method chaining on the heredoc as it is passed as a method argument.
ActiveRecord::Base.sanitize_sql_array([<<~SQL.strip, user_id: @user.id])
  select *
  from books
  where user_id = :user_id
    -- ...
SQL
That opening heredoc identifier looks sorta like it has been orphaned in that first line, but Ruby knows to parse the following lines as the body of the heredoc.
Part of why the above syntax can look off is that editors and syntax highlighters don't always know exactly how they should render it.
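Here is a self-contained sketch of the same shape: a heredoc that sits inside an array argument and has a method chained onto it. The bind_named method is a made-up, deliberately naive substitute for a real sanitizer. It does no quoting or escaping, so treat it as illustration only.

```ruby
# Hypothetical helper: replaces :name placeholders with plain values.
# This is NOT safe SQL sanitization; it only demonstrates the call shape.
def bind_named(parts)
  template, values = parts
  values.reduce(template) do |sql, (key, value)|
    sql.gsub(":#{key}", value.to_s)
  end
end

# The heredoc opener is the first array element, with .strip chained on.
# The hash is the second element. The body starts on the next line.
BOUND_QUERY = bind_named([<<~SQL.strip, { user_id: 42 }])
  select *
  from books
  where user_id = :user_id
SQL

puts BOUND_QUERY
```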
Stacked Heredocs
The previous section addressed the question of how a heredoc is passed as an argument to a method. In this section, we look at how we can pass multiple heredocs to a method.
I've fabricated a DB class with #transaction, #execute, and #sanitize methods to keep the following example focused. We have a #run_queries_in_transaction method that takes a variable number of queries and executes them in a transaction. Below that we call the method with a series of stacked heredocs.
def run_queries_in_transaction(*queries)
  DB.transaction do
    queries.each do |query|
      DB.execute(query)
    end
  end
end
job_id = 123
run_queries_in_transaction(<<~Q1, <<~Q2, DB.sanitize(<<~Q3, job_id))
  update users
  set status = 'inactive'
  where last_logged_in_at < (now() - '6 months'::interval);
Q1
  insert into events (name, description)
  values ('inactive_users', 'Mark users as inactive');
Q2
  delete from background_jobs
  where id = ?;
Q3
Notice we are able to list the heredoc arguments one after another using the opening identifier like we would any other argument to a method. The heredoc bodies then follow on the lines below, in the same order as their openers, each one closed by its own unique identifier.
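For anyone who wants to run the stacked form, here is a sketch with a stand-in for the fabricated DB class. FakeDB, its LOG constant, and run_in_fake_transaction are all assumptions invented for this example. The "transaction" only records BEGIN and COMMIT markers, and sanitize does a naive integer substitution.

```ruby
# Hypothetical stand-in for the DB class: it logs statements instead of
# talking to a database.
module FakeDB
  LOG = []

  def self.transaction
    LOG << "BEGIN"
    yield
    LOG << "COMMIT"
  end

  def self.execute(query)
    LOG << query.strip
  end

  # Naive placeholder substitution, for illustration only.
  def self.sanitize(query, value)
    query.sub("?", Integer(value).to_s)
  end
end

def run_in_fake_transaction(*queries)
  FakeDB.transaction do
    queries.each { |query| FakeDB.execute(query) }
  end
end

# Three openers on one line. The bodies follow in the same order.
run_in_fake_transaction(<<~Q1, <<~Q2, FakeDB.sanitize(<<~Q3, 123))
  update users set status = 'inactive';
Q1
  insert into events (name) values ('inactive_users');
Q2
  delete from background_jobs where id = ?;
Q3

puts FakeDB::LOG
```

The log ends up with BEGIN, the three statements in opener order (the last one with 123 substituted), and COMMIT.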
Heredoc Oddities
A single-quoted heredoc with a blank identifier can be created. It is terminated by the first empty line.
" msg1 = <<''
" hello
" world
>
=> " hello\n world\n"
This 'end-less heredoc' falls under the category of "please don't do this."
There are also backtick heredocs that can execute commands.
puts <<-`HEREDOC`
cat #{__FILE__}
HEREDOC
For a better explanation and example of that, head over to the backtick page.
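In the meantime, here is a minimal sketch that is harmless to run. It assumes a Unix-like shell where echo is available, and the SHELL_OUTPUT constant is just a name picked for the example.

```ruby
# A backtick heredoc runs its body as a shell command and evaluates to the
# command's standard output. Assumes a POSIX-style shell with `echo`.
SHELL_OUTPUT = <<~`CMD`
  echo "hello from the shell"
CMD

puts SHELL_OUTPUT
```

Because the body is handed to the shell, the usual caution about interpolating untrusted input into commands applies here too.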