Insert Local Files
You can use clickhouse-client to stream local files into your ClickHouse service. This lets you preprocess the data with ClickHouse's many built-in functions as it is inserted. Let's look at an example.
- Suppose we have a TSV file named comments.tsv that contains some Hacker News comments and whose header row contains the column names. You need to specify an input format when you insert the data, which in our case is TabSeparatedWithNames.
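If you do not have such a file at hand, a stand-in can be sketched as follows. The column names and the two rows are invented purely for illustration (only author and comment are named in this guide), and the command overwrites any existing comments.tsv in the current directory.

```shell
# Write a tiny sample file; fields are separated by literal tab characters.
# NOTE: columns and rows are made up for illustration, and this overwrites comments.tsv.
printf 'id\ttype\tauthor\ttimestamp\tcomment\n' > comments.tsv
printf '1\tcomment\tAlice\t2024-01-01 10:00:00\tClickHouse is fast\n' >> comments.tsv
printf '2\tcomment\tBOB\t2024-01-01 10:05:00\tI agree with Alice\n' >> comments.tsv

# The first line is the header row of column names that TabSeparatedWithNames expects.
head -n 1 comments.tsv
```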
- Let's create the table for our Hacker News data:
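A minimal sketch of the table definition, run through clickhouse-client. The schema is an assumption: only the author, comment, and tokens columns and the hackernews table name come from this guide, while the remaining columns, the engine, and the sorting key are illustrative. Add your own connection options (for example --host, --secure, and --password) as your service requires.

```shell
# Hypothetical schema: id, type, and timestamp are assumed columns, not taken from the guide.
CREATE_TABLE_SQL="
CREATE TABLE IF NOT EXISTS hackernews
(
    id UInt32,
    type String,
    author String,
    timestamp DateTime,
    comment String,
    tokens Array(String)
)
ENGINE = MergeTree
ORDER BY timestamp
"

# Run the statement only when the client is installed; add connection options as needed.
if command -v clickhouse-client >/dev/null 2>&1; then
    clickhouse-client --query "$CREATE_TABLE_SQL"
fi
```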
- We want to lowercase the author column, which is easily done with the lower function. We also want to split the comment string into tokens and store the result in the tokens column, which can be done using the extractAll function. You can do all of this in one clickhouse-client command. Notice how the comments.tsv file is redirected to the standard input of clickhouse-client using the < operator:
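The command might look like the sketch below. The column list passed to input is an assumption and must match the columns actually present in your file; the \w+ pattern is one plausible way to tokenize on word characters. The query is kept in a quoted heredoc so the shell leaves the backslashes untouched.

```shell
# Quoted heredoc ('SQL') prevents the shell from altering the regex backslashes.
# The input(...) structure is an assumed schema; adjust it to your file's columns.
INSERT_SQL=$(cat <<'SQL'
INSERT INTO hackernews
SELECT
    id,
    type,
    lower(author),
    timestamp,
    comment,
    extractAll(comment, '\\w+') AS tokens
FROM input('id UInt32, type String, author String, timestamp DateTime, comment String')
FORMAT TabSeparatedWithNames
SQL
)

# Redirect the file into the client's standard input with the < operator.
# Add connection options (--host, --secure, --password, ...) as your service requires.
if command -v clickhouse-client >/dev/null 2>&1 && [ -f comments.tsv ]; then
    clickhouse-client --query "$INSERT_SQL" < comments.tsv
fi
```

The SELECT lists its columns in the same order as the table definition, because INSERT ... SELECT matches columns by position.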
Note
The input function is useful here as it allows us to convert the data as it's being inserted into the hackernews table. The argument to input is the structure (column names and types) of the incoming raw data, and you will see this in many of the other table functions (where you specify a schema for the incoming data).
- That's it! The data is up in ClickHouse:
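One way to check the inserted rows is a short query such as the sketch below. The selected columns and the LIMIT are arbitrary choices for illustration.

```shell
# Illustrative check: show the lowercased authors and the extracted tokens.
SELECT_SQL="SELECT author, tokens FROM hackernews LIMIT 5"

# Add connection options (--host, --secure, --password, ...) as your service requires.
if command -v clickhouse-client >/dev/null 2>&1; then
    clickhouse-client --query "$SELECT_SQL"
fi
```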
The result is:
- Another option is to use a tool like cat to stream the file to clickhouse-client. For example, the following command has the same result as using the < operator:
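A sketch of the cat variant, using the same kind of INSERT statement with an assumed input structure. Only the way the file reaches the client changes: a pipe instead of a redirection.

```shell
# Same style of query as with the < operator; the input(...) structure is an assumed schema.
INSERT_SQL=$(cat <<'SQL'
INSERT INTO hackernews
SELECT id, type, lower(author), timestamp, comment, extractAll(comment, '\\w+') AS tokens
FROM input('id UInt32, type String, author String, timestamp DateTime, comment String')
FORMAT TabSeparatedWithNames
SQL
)

# Pipe the file into the client's standard input instead of redirecting it.
if command -v clickhouse-client >/dev/null 2>&1 && [ -f comments.tsv ]; then
    cat comments.tsv | clickhouse-client --query "$INSERT_SQL"
fi
```

Running this after the redirection-based command inserts the same rows a second time, so use one form or the other.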
Visit the docs page on clickhouse-client for details on how to install clickhouse-client on your local operating system.