

    Spotfire® Data Science - Team Studio Glossary

    Associate a Dataset

    Associating a dataset makes that data available to the workspace under the Data tab. Associated datasets are not part of the sandbox schema. You can use them to bring in data that is not contained within the main data source of the workflow.

    Comment

    People can comment on anything in the Activity pane in the workspace overview. Comments can be edited or deleted.

    Connect View

    A Connect View is a generated table created by a join or select statement on a database data source. It is stored locally in the application database.

    Custom Operator

    You can integrate your own algorithms and processes into the Team Studio analytics engine using the Custom Operator Framework. Custom Operators are written using Scala or Java. They can use Spark for advanced machine learning and transformations.

    Dataset

    Datasets come from databases, HDFS, or uploaded files such as CSV files. They are used within workflows, Touchpoints, and sandboxes in order to perform analyses.

    Data Source

    A data source refers to an external data provider, either on Hadoop or as a relational database.


    Data Source States

    Online

    The data source has correct connection information and can be used in Team Studio.

    Offline

    The data source is having trouble connecting. Verify the connection parameters and try again.

    Incomplete

    A user has begun to add this data source but has not completed the connection parameters. Users can save some data sources as incomplete. The data source is not usable until the rest of the data is provided.

    Disabled

    The data source has been disabled by an administrator and cannot be used until it is turned back on. Users may not open workflows or run jobs using this data source.
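
    The four states above can be modeled as a simple enumeration. The following Python sketch is illustrative only: the names DataSourceState, is_usable, and suggested_action are hypothetical and are not part of any Team Studio API. It assumes, based on the descriptions above, that only an Online data source can be used.

```python
from enum import Enum


class DataSourceState(Enum):
    """Hypothetical model of the four data source states in this glossary."""
    ONLINE = "Online"
    OFFLINE = "Offline"
    INCOMPLETE = "Incomplete"
    DISABLED = "Disabled"


def is_usable(state):
    # Assumption: only an Online data source can be used in workflows and jobs.
    return state is DataSourceState.ONLINE


def suggested_action(state):
    # Next steps paraphrased from the state descriptions in this glossary.
    actions = {
        DataSourceState.ONLINE: "No action needed.",
        DataSourceState.OFFLINE: "Verify connection parameters and try again.",
        DataSourceState.INCOMPLETE: "Provide the remaining connection parameters.",
        DataSourceState.DISABLED: "Ask an administrator to turn it back on.",
    }
    return actions[state]
```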


    Dashboard Widget

    On the homepage of Team Studio after logging in, people can customize their view to create a dashboard of important information. Click the gear icon on the home page to customize widgets.

    Design Time

    Design time refers to the actions a person takes before running an operator or workflow. At design time, the person is "designing" the operator, that is, customizing its parameters. This is an important concept to understand when learning about Custom Operators.

    Insight

    Insights are pieces of information that are deemed important to a particular workspace or workflow. People can add an insight directly or promote a note or a comment to an insight. One can also attach workflow results, datasets, and files, including files uploaded from the desktop, to support the finding.

    Job

    People can schedule a Job to run at a regular time interval or on demand. This is useful for updating data automatically over time or running specific tasks overnight. Jobs can be set to run on an hourly, daily, weekly, or monthly schedule. Depending on the configuration, team members can be notified when a job succeeds or fails.

    Milestone

    A Milestone represents a section of work that a team member works on. By setting a due date, people can see at a glance the progress of a particular analytic project. Milestones are shown on the workspace page under the "Milestones" tab, and also on the workspace overview. One can also change the status of the project to one of the three available options: On Track, Needs Attention, or At Risk.

    Note

    People can make notes on a workspace or any workfile within that workspace. Notes appear in the Activity pane in the workspace overview and are viewable by those with access to the workspace. Notes can be promoted to Insights and commented on by other people.

    Notification

    Notifications alert people to important changes in the application, such as job results or collaboration information. For example, when someone adds another person to a workspace, that person gets a notification. Many types of activities can notify people, and each can be configured individually.

    Operator

    An operator encapsulates an algorithm or transformation within a workflow. Operators appear as a list in the sidebar of the workflow editor and can be dragged to the canvas and connected to other operators or data sources. Together with data sources, operators are the main building blocks of workflows. People can filter the operators by intended operation, such as Load, Explore, Transform, Model, Predict, and Tools.

    Run

    Running a workflow executes the entire workflow, including all operators and data sources. A workflow that contains invalid operators cannot be run as a whole; in that case, people can use a step run instead. Like a step run, selecting "Run" only executes operators that have not run before. To re-run everything from scratch, select "Clear Step Run Results" from the contextual menu.

    Sandbox

    A sandbox is a place in the Data tab where people can bring in their training schema and perform simple explorations with the help of visualizations. One can also create external views from the sandbox datasets. Sandboxes are only available for database data sources.

    Shared Account

    A shared account can be used to provide access to a database data source for multiple members of an organization. This means that people will share a single set of credentials for the data source.

    Team Studio Model

    Team Studio models are generated from a workflow and then saved as .am files. This makes models portable from one workflow to another without having to build the whole model again. Team Studio models are saved under the Work Files tab of a workspace.

    Step Run

    By right-clicking an operator's icon and selecting "Step Run", people can run only the operators and data sources needed to reach the selected operator in a workflow. Operators that have already been executed do not execute again. This allows for faster iteration, because the person does not need to run the entire workflow again.
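
    The step run behavior described above can be sketched as a walk over the workflow graph. The following Python sketch is illustrative only and does not reflect Team Studio's actual implementation: the function step_run, the upstream mapping, and the operator names are all hypothetical. It assumes the workflow is acyclic and that results of executed operators are kept until they are cleared.

```python
def step_run(upstream, target, executed, run_operator):
    """Run `target` plus any upstream operators that have not executed yet.

    upstream:     dict mapping an operator name to the names of its direct inputs
    target:       the operator selected for the step run
    executed:     set of operators that already have results (updated in place)
    run_operator: callback that executes a single operator
    Returns the names that were executed by this call, in execution order.
    """
    ran = []

    def visit(name):
        if name in executed:
            return  # already has results, so it is not executed again
        for parent in upstream.get(name, []):
            visit(parent)  # inputs must run before the operator itself
        run_operator(name)
        executed.add(name)
        ran.append(name)

    visit(target)
    return ran


# Hypothetical workflow: each operator lists the operators feeding into it.
workflow = {
    "dataset": [],
    "filter": ["dataset"],
    "model": ["filter"],
    "predict": ["model", "filter"],
    "summary": ["dataset"],
}

executed = set()
first = step_run(workflow, "model", executed, lambda name: None)
second = step_run(workflow, "predict", executed, lambda name: None)
```

    In this sketch, the first call executes "dataset", "filter", and "model" and leaves "summary" untouched, because it is not needed to reach "model". The second call executes only "predict", since its inputs already have results. Emptying the executed set plays the role of "Clear Step Run Results".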

    Sub-flow

    The Sub-flow operator allows people to run another workflow from the current workflow.

    Tag

    Tags can be used to categorize datasets or results within the application. Tags can be added to any workfile. Selecting a tag brings up a view of all workfiles with that tag.


    Touchpoint

    Team Studio Touchpoints wrap the functionality of complex workflows in an interactive application that can be consumed by the business analyst.

    Touchpoint Catalog

    If the Team Studio instance includes Touchpoints:

    By selecting the "Touchpoints" item from the sidebar, people can see the collection of Touchpoints that team members have published.

    Publish a Touchpoint

    When a developer finishes creating a Touchpoint, they can "publish" it to the Touchpoint catalog, which makes it available to all members of the application. Depending on the Touchpoint's settings, the Touchpoint runs either as its creator or as the person running it. An unpublished Touchpoint can still be run by members of the workspace the Touchpoint resides in.

    Unpublish a Touchpoint

    To remove a Touchpoint from the catalog, you can unpublish it. Once unpublished, the Touchpoint is no longer available for everyone in the application to view and run. It is restricted to the members of the workspace where the Touchpoint originated.


    Workfile

    A workfile is any file that is within the Work Files section under the workspace view in the application. Workfiles can be workflows (analytic processes with operators), but they can also be SQL files, CSV files, Touchpoints, Team Studio Models, or result files saved to the workspace. Additionally, people can upload other types of files (such as ZIP files) for reference.

    Workflow

    A workflow is a collection of datasets and operators that performs analytic tasks. It is the place where data scientists build out models using the available operators and algorithms in Team Studio. In the Work Files view, workflows are displayed as pages with the Team Studio logo attached.

    Workflow Variable

    People can override default parameters and define their own workflow-wide variables using workflow variables. To set one up, select Actions > Workflow Variables from the menu in a workflow. All workflow variable names start with the character @. Workflow variables are commonly used for input in Touchpoints. They can also be used to edit default paths and parameters for HDFS.
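
    To illustrate the idea of a workflow-wide variable, the following Python sketch replaces @-prefixed names in a parameter string with their values. It is a conceptual illustration only, not how Team Studio resolves variables internally: the substitute function, the matching rule (an @ followed by letters, digits, or underscores), and the variable names @output_dir and @min_rows are assumptions made for this example.

```python
import re

# Assumption: a variable is "@" followed by letters, digits, or underscores.
VARIABLE_PATTERN = re.compile(r"@\w+")


def substitute(text, variables):
    """Replace each known @variable in `text` with its value.

    Names that are not defined in `variables` are left unchanged.
    """
    def replace(match):
        name = match.group(0)
        return str(variables.get(name, name))

    return VARIABLE_PATTERN.sub(replace, text)


# Hypothetical variables defined once and reused across operator parameters.
variables = {"@output_dir": "/tmp/results", "@min_rows": 100}

resolved = substitute("@output_dir/run_@min_rows.csv", variables)
unresolved = substitute("@unknown/file.csv", variables)
```

    Here resolved is "/tmp/results/run_100.csv", while unresolved stays "@unknown/file.csv" because @unknown is not defined. Changing a value in one place updates every parameter that refers to it, which is the main reason to use a workflow-wide variable.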

    Workspace

    Team members can use workspaces to collaborate on a data science project. Workspaces hold workfiles, and they can also have scheduled jobs, milestones, and associated database and Hadoop datasets under the Data tab to keep track of progress on a project. A workspace can be either public or private. People can create a workspace from the workspace page.

