Jump to content

How to avoid empty cells when summing across multiple tables?


Sevda
Go to solution Solved by Olivier Keugue Tadaa,

Recommended Posts

 

Here’s an improved version of your text with enhanced clarity and organization:


Hello everyone,

I am new to Spotfire and have encountered an issue while working with multiple tables. I have two tables that share some common columns, such as 'Region'. My goal is to create an overview that combines data from both tables and shows values such as the purchases for example. To achieve this, I used column matches to align the tables based on the 'Region' column and visualize them together in a cross table.

However, I’ve run into a problem: this approach works well for regions that exist in both Table 1 and Table 2, but for regions that only exist in one of the tables, I get empty values. For instance, as shown in the attached screenshot, the region 'China' is present in Table 1 but not in Table 2, resulting in an empty cell where I expected the blue value.

I’ve considered using an outer join to solve this issue, but I’m dealing with large datasets and plan to add more tables in the future. Therefore, I would prefer to keep the tables separate. I’ve tried writing a function that adds the missing regions to each table, which works, but I don’t think it’s a practical long-term solution—especially as I anticipate having other columns with the same issue like 'Region'. Writing individual functions for each would become cumbersome.

Since I’m still learning Spotfire, I’m sure there’s a more efficient way to handle this without having to join the tables, but I haven’t been able to figure it out. I would greatly appreciate any suggestions or solutions for dealing with this issue.

Thank you in advance for your help!

Screenshot 2024-10-02 114902.png

Link to comment
Share on other sites

Hi Sevda,

I would try out a calculated column on the joined/merged data table.
You can use a CASE statement to replace the NULL values with a 0. See also the screenshot below:
image.thumb.png.c3529dc29f5303869af218996864a64c.png

The region_id is the original column (like your table2 column). If you add a calculated column, using this code (CASE  WHEN [Table 2] is null THEN 0 else [Table 2] END), you may fix the issue.

Kind regards,

David

Link to comment
Share on other sites

Hi David,

Thank you for your suggestion! I had considered this method, but it would require joining the tables, which I’m hoping to avoid. As I mentioned earlier, both tables have hundreds of columns and hundreds of thousands of rows. Joining them would result in a very large and complex table with many columns and numerous missing values, making it difficult to manage in the long term.

Is there any way to keep the tables separate and still achieve this without joining them?

Link to comment
Share on other sites

Hi Sevda,

Right, even though you mentioned that, I thought by looking at your example you made some connection between the two tables.
But can I assume that you want to combine the data from both tables without joining them (only use matching columns or a relation between them)?

Kind regards,

David

Link to comment
Share on other sites

Hi David,

Yes, if possible, I would prefer not to join or merge the tables, for the reasons I mentioned earlier. In my cross table example from above, I only used column matches because I understood that this would allow me to use both tables simultaneously in one visualization. However, I realize that my initial understanding may not be correct or that this approach may not be appropriate for my situation. I’m open to other suggestions that don’t involve merging or joining the tables. Any further advice would be much appreciated!

Link to comment
Share on other sites

  • Solution

Hi Sevda, 

When you use multiple tables on a single visualization, there is always a "controlling table" from which the column matches apply. Therefore in your case, "table 1" will show all the values for the regions. For the ones that are not in "Table 2",  a standard addition will show null (since the second value is null).  I can propose a workaround which is using the SN (substitute null) function for the other tables rather than the controlling table
 

-- This expression will be empty for missing data un "table 2"
Sum([value1]) + Sum([Data Table (2)].[Value 2])

-- This expression will display the right value even for missing data in "table 2" 
Sum([value1]) + SN(Sum([Data Table (2)].[Value 2]),0)

 

Let me know if this helps.

See the attached sample image 

image.thumb.jpeg.d1d6b62dd58a3c36e9e2a5867cd6ee0f.jpeg

  • Like 1
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...