Rows in datasets

Hi, all I want to do is return the same dataset but with only the first row left. This doesn't work since the grabbed headers aren't the right data type. How can I alter datasets efficiently?

value is a JsonDataset converted to a PyDataset.


Search this forum for DatasetBuilder examples.

I'm sorry but... those few lines hurt my brain and soul.

  • there's an unconditional return in the loop, so it ALWAYS returns on the first iteration and the remaining iterations never happen
  • it's followed by a break!
  • if the loop actually were looping, you'd be pulling the same headers on every iteration
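
To make the first two points concrete, here's a minimal sketch in plain Python. The function name and sample rows are made up for illustration; this is not the original script.

```python
def first_item(rows):
    # Hypothetical stand-in for the loop described above.
    for row in rows:
        return row   # unconditional: the function exits on the first iteration
        break        # unreachable, the return above has already left the function
    return None      # only reached when rows is empty


# Only the first element ever comes back, no matter how many rows there are.
print(first_item([["a", 1], ["b", 2], ["c", 3]]))
```

In other words, the loop adds nothing: indexing the first element directly does the same job.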

Concerning the actual problem: The headers you get from getColumnNames are actually not a sequence toDataSet accepts. It boggles my mind, but that's how it is. Cast the whole thing to a list and it will work just fine.
The next problem you'll have is that row is a simple list (well... almost), but it should be a list of lists.

Here's something that should work (can't test it though):

return system.dataset.toDataSet(
    list(value.columnNames),
    [value[0]]
)

This assumes value[0] is in a format toDataSet accepts. Otherwise, cast it to a list as well; I'm pretty sure that works.
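
If the shapes are unclear, this plain-Python sketch shows what the two arguments should look like before they reach toDataSet. The helper name and the sample values are hypothetical, and tuples stand in for whatever sequence types the dataset hands back.

```python
def to_dataset_args(headers, first_row):
    # Hypothetical helper: puts both arguments in the shape described above.
    header_list = list(headers)   # plain list of column names
    rows = [list(first_row)]      # list of lists, even for a single row
    return header_list, rows


headers, rows = to_dataset_args(("id", "name"), (1, "pump"))
# In Ignition, these would then be passed on:
#   system.dataset.toDataSet(headers, rows)
```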


The list returned by a dataset's columnNames property is perfectly acceptable to a DatasetBuilder's .colNames() method. Same with a dataset's .columnTypes property and a builder's .colTypes() method. Then you can construct one row as a simple list, and add it to the builder with a single .addRow(). Then .build() and return.
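
A rough sketch of those steps, untested. The function name is made up, and the builder is passed in as a parameter so the function itself has no Ignition imports; inside Ignition I'm assuming it would come from DatasetBuilder.newBuilder().

```python
def first_row_only(value, builder):
    # value: the source dataset; builder: assumed to be a fresh DatasetBuilder.
    # Construct the first row as a simple list of cell values.
    row = [value.getValueAt(0, col) for col in range(value.getColumnCount())]
    return (
        builder
        .colNames(value.getColumnNames())   # column names carried over as-is
        .colTypes(value.getColumnTypes())   # column types carried over as-is
        .addRow(row)                        # the single row, passed as a simple list
        .build()
    )


# Assumed usage inside an Ignition script:
#   from com.inductiveautomation.ignition.common.util import DatasetBuilder
#   return first_row_only(value, DatasetBuilder.newBuilder())
```

This assumes the source dataset has at least one row; add a row-count check first if it might be empty.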

(If you need to be sure column types are transferred, use DatasetBuilder.)