Working with a Data Extract


1. Extract Data

>> Concept

An extract is a copy of the data that is brought into the Tableau data engine.


>> Advantages

  1. Queries run in the Tableau data engine rather than being sent to the data source
    • Reduce the time it takes for queries to run
  2. Allow you to keep a copy of the data:
    • Can be accessed offline
  3. May include only a subset of the data

>> Limitations

  1. Data doesn’t update automatically
    • Need to refresh the Extract
    • By contrast, a live connection queries the database directly, so the data is updated every time you open your workbook
  2. Your extracted data source may not include all the fields required for the views
    • Because extracts may include only a subset of the data
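
The first limitation can be illustrated with a small sketch. It is not Tableau code: the rows, field names, and the functions `live_query` and `create_extract` are hypothetical stand-ins, assuming only that an extract behaves like a snapshot copy while a live connection reads the source on every query.

```python
import copy

# Hypothetical stand-in for a database table (not a Tableau API).
source_rows = [
    {"order_id": 1, "region": "East", "sales": 100},
    {"order_id": 2, "region": "West", "sales": 250},
]


def live_query(rows):
    """Total sales computed from whatever rows are passed in right now."""
    return sum(r["sales"] for r in rows)


def create_extract(rows):
    """Take a snapshot copy of the rows at extraction time."""
    return copy.deepcopy(rows)


extract = create_extract(source_rows)

# The source changes after the extract was taken.
source_rows.append({"order_id": 3, "region": "East", "sales": 50})

live_total = live_query(source_rows)   # reads the source, sees the new row
stale_total = live_query(extract)      # reads the old snapshot

# Refreshing the extract means copying the source again.
extract = create_extract(source_rows)
fresh_total = live_query(extract)

print(live_total, stale_total, fresh_total)
```

Until the snapshot is taken again, the extract total lags behind the live total, which is why an extract needs an explicit refresh.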


2. Create and Edit Extracts

>> Two places to create extracts

  • From the Data Source page

    image-20210602152449487
  • On the worksheet

    - [Data] Pane --> right-click the data source --> [Use Extract]

    - Switch between a live and an extracted data connection: check / uncheck [Use Extract]

    image-20210602152258117

    image-20210602152950020


>> Two places to edit Extracts

  • From the Data Source page

    image-20210602161938019
  • On the worksheet

    - [Data] Pane --> right-click the data source --> [Extract Data]

    image-20210602154111980

    [Edit extracts from the Extract Data dialog box]

    image-20210602152648443

>> Update Data

  • [Data] Pane --> right-click the data source --> [Extract] --> [Refresh]

    image-20210602162736858

>> Hide / Unhide fields

  • Hide unused fields

    image-20210602164034539
  • Unhide fields

    • [Data] Pane --> right-click --> [Show Hidden Fields]

      image-20210602164326135 image-20210602164355176

    • Field to show --> right-click --> check [Unhide]

      image-20210602164533673


3. Refresh Extracts

[Tableau Help] – Refresh Extracts (中: 刷新数据提取 / 한: 추출 새로 고침)

>> Two Types:

  • Full extract refresh (default) [完全刷新]
    • all of the rows are replaced with the data in the original data source
    • [good] ensures that you have an exact copy of what is in the original data
    • [bad] can sometimes take a long time and be expensive on the database
  • Incremental extract refresh [增量刷新]
    • configure a refresh to add only the rows that are new since the previous time you extracted the data
    • Note: If the data structure of the source data changes (for example, a new column is added), you will need to do a full extract refresh before you can start doing incremental refreshes again.
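
The difference between the two refresh types can be sketched in a few lines. This is a simplified model, not Tableau's implementation: `full_refresh`, `incremental_refresh`, and the `order_id` key column are hypothetical, and the sketch assumes new rows are identified by a key value larger than any already extracted.

```python
def full_refresh(source_rows):
    """Replace every row in the extract with a copy of the current source rows."""
    return [dict(r) for r in source_rows]


def incremental_refresh(extract_rows, source_rows, key="order_id"):
    """Append only the source rows whose key exceeds the largest key already extracted."""
    last_seen = max((r[key] for r in extract_rows), default=None)
    new_rows = [
        dict(r) for r in source_rows
        if last_seen is None or r[key] > last_seen
    ]
    return extract_rows + new_rows


source = [
    {"order_id": 1, "sales": 100},
    {"order_id": 2, "sales": 250},
]
extract = full_refresh(source)

# Later: one new row arrives and one existing row is corrected in the source.
source.append({"order_id": 3, "sales": 50})
source[0]["sales"] = 120

incremental = incremental_refresh(extract, source)
full = full_refresh(source)

incremental_sales = [r["sales"] for r in incremental]
full_sales = [r["sales"] for r in full]
print(incremental_sales, full_sales)
```

Under these assumptions the incremental result gains the new row but keeps the old value for the corrected row, while the full refresh matches the source exactly. That is the trade-off between the two types: less work per refresh versus an exact copy.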

>> Full Refresh (Default)

  • [Data] Pane --> right-click the data source --> [Extract] --> [Refresh]

    image-20210602162736858

>> Update the data extract

After editing the extract (e.g. unhiding fields), create a new extract to overwrite the existing one:

  • [Data] Pane --> right-click --> [Extract Data]

  • [Number of Rows] --> check [All rows] --> check [Incremental refresh]

  • Click [Extract]

    image-20210602165123365 image-20210602165055997 image-20210603164635661
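
Why the extract has to be re-created after unhiding a field can be sketched as follows. This is a toy model, not Tableau code: `create_extract` and the field names are hypothetical, and it assumes that fields hidden at extraction time are left out of the extract.

```python
# Hypothetical source table (not a Tableau API).
source_rows = [
    {"order_id": 1, "region": "East", "sales": 100, "profit": 20},
    {"order_id": 2, "region": "West", "sales": 250, "profit": 60},
]


def create_extract(rows, hidden_fields):
    """Copy the rows, leaving out any field that is hidden at extraction time."""
    return [
        {name: value for name, value in row.items() if name not in hidden_fields}
        for row in rows
    ]


hidden = {"profit"}
extract = create_extract(source_rows, hidden)
has_profit_before = "profit" in extract[0]

# Unhiding the field does not change the extract that already exists.
hidden.discard("profit")
still_missing = "profit" not in extract[0]

# Creating the extract again overwrites it with the field included.
extract = create_extract(source_rows, hidden)
has_profit_after = "profit" in extract[0]

print(has_profit_before, still_missing, has_profit_after)
```

In this model the unhidden field only becomes available once the extract is created again, which matches the overwrite step described above.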