CSV.Document is a Power Query function available in Microsoft Excel and Power BI that makes it easy to load CSV files as tables. Find out everything you need to know about how it works, its usefulness and its benefits!
Invented even before the personal computer, the CSV (Comma-Separated Values) format is regarded as the standard for exchanging data between different applications and systems.
CSV files are easy to read and edit. However, handling them can be tedious when working with large volumes of data. In the age of Big Data, this limitation is problematic.
With Microsoft Excel and Power BI, the Csv.Document function in Power Query’s M language is an invaluable aid. It enables users to load CSV files simply and efficiently.
What is a CSV?
A CSV is a text file containing data structured in the form of a table. Within this file, each line represents a row in the table.
Each field in a line is separated by a comma, hence the name “Comma-Separated Values”.
A CSV file can be opened or modified with any text editor, including Notepad. The strength of this format is that it can be read by a large number of applications and systems.
This makes it easy to use this type of file for data exchange. However, with the massive increase in the volume of data, the task becomes more complicated.
CSV files containing a lot of data are very large, and can take a very long time to load. The Power Query engine avoids this inconvenience.
What is Power Query?
The Power Query data transformation and preparation engine enables users to load, transform and merge data from multiple sources in Microsoft Excel, Power BI and other Power Platform products.
Its intuitive graphical interface lets you filter, sort and manipulate data without any programming knowledge.
Data is retrieved from a variety of sources, including SQL databases, web pages, XML files and CSV files.
It is then stored in a table, which the user can interact with. The data can be filtered and sorted, or modified as required.
Power Query’s M programming language is used to transform or manipulate data in Power Platform tools such as Excel and Power BI. It can be used to create queries to import, clean and transform data.
To load the contents of a CSV file into Microsoft Excel or Power BI in the form of a table with Power Query, use the Csv.Document function in the M language.
How do I use CSV.Document in Power Query?
The Csv.Document function imports data from a CSV file, whether it is saved locally or on a remote server, and turns it into a structured table ready for analysis and display.
Its syntax is as follows:
= Csv.Document(file, optional options)
The ‘file’ parameter indicates the location of the CSV file to be imported. The ‘options’ parameter is used to specify additional options for reading the file.
These include the field separation character, character encoding or the absence of column headers. By default, the ‘Delimiter’ is a comma, but it is possible to use a different delimiter.
The “Encoding” option designates the character encoding to be used to read the CSV file. By default, Power Query automatically detects the correct encoding by analysing the file.
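For example, a semicolon-delimited file saved in UTF-8 could be read as follows. This is only a minimal sketch: the file path is purely illustrative, and the options shown are the ones discussed above.
let
    // 65001 is the UTF-8 code page; Delimiter overrides the default comma
    Source = Csv.Document(File.Contents("C:\data\example.csv"), [Delimiter=";", Encoding=65001])
in
    Source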
Once the data has been imported, other Power Query functions can be used to transform the data, such as Split Column, Group By or Filter Rows. Here’s an example:
let
    Source = Csv.Document(File.Contents("C:\monfichier.csv"), [Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"First row header" = Table.PromoteHeaders(Source)
in
    #"First row header"
In this case, CSV.Document is used to load a CSV file stored on the computer’s hard disk with a comma delimiter and 5 columns. The File.Contents function is used to specify the location of the CSV file, and Table.PromoteHeaders is used to promote the first row of the CSV file as column headers in the resulting Power Query table.
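Building on the same query, further steps can then refine the promoted table. The sketch below adds a row filter, assuming the file contains a column named "Amount" (an illustrative column name, not part of the original example).
let
    Source = Csv.Document(File.Contents("C:\monfichier.csv"), [Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"First row header" = Table.PromoteHeaders(Source),
    // Keep only the rows where the hypothetical "Amount" column exceeds 100
    #"Filtered rows" = Table.SelectRows(#"First row header", each Number.FromText([Amount]) > 100)
in
    #"Filtered rows"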
Advantages of CSV.Document in Power Query
Using Csv.Document with Power Query offers a number of advantages. It simplifies the import of CSV files into Excel or Power BI, while enabling data transformations to be carried out.
This saves invaluable time, since it is no longer necessary to enter data manually. The import process is also automated. What’s more, importing data in the form of a structured table makes it easier to handle and analyse.
What’s more, Power Query can be configured to automatically update data whenever a CSV file is modified. This eliminates the risk of human error and ensures that the data is always up to date.
Power Query’s compatibility with many other sources also means that you can work with different types of data from a centralised interface.
Finally, data from several CSV files or other sources can be combined into a single table for easier analysis.
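As a sketch, two CSV files could each be loaded with Csv.Document and appended with Table.Combine; the file names below are illustrative, and the files are assumed to share the same columns.
let
    Sales2023 = Table.PromoteHeaders(Csv.Document(File.Contents("C:\data\sales_2023.csv"))),
    Sales2024 = Table.PromoteHeaders(Csv.Document(File.Contents("C:\data\sales_2024.csv"))),
    // Append the two tables into one, assuming matching column names
    Combined = Table.Combine({Sales2023, Sales2024})
in
    Combined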
Conclusion: CSV.Document, a simple method for importing into Excel
In conclusion, the Csv.Document function in Power Query is a very useful feature for importing large CSV files as tables into Excel or Power BI, rather than copying the data manually.
To learn how to master Power BI and all its subtleties, you can choose DataScientest. Our Power BI training course gives you complete mastery of the software in just 38 hours spread over 5 days.
The first part of the course takes 3 days and covers Power Query, the Dax language and the basics of dataviz. The second part lasts 2 days, and allows you to learn how to use the tool in greater depth, while discovering advanced concepts such as incremental updating and DataFlows.
You can take the whole course, or just one of the two parts, depending on your level and needs. At the end of the course, you can sit the Power BI Data Analyst Associate certification exam.
This programme can be completed entirely by distance learning via the web, and our organisation is eligible for funding options. Discover DataScientest!