# Basic table operations

This section covers basic crandas functionality: inspecting tables, assigning new columns, computing with columns, boolean logic, and manipulating column names.

## Inspecting `CDataFrames`

While you cannot normally access the data stored in a `CDataFrame`, you can call the `CDataFrame.describe()` method to get summary statistics of its numeric columns.

```
import crandas as cd

df = cd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [-1, -1, -1, -1, -1]})
```

```
>>> print(df.describe())
                 A        B
0   type   integer  integer
1  count         5        5
2   mean       3.0     -1.0
3    std  1.581139      0.0
4    min         1       -1
5    max         5       -1
```

See Working with numeric data for details.
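As a cross-check on the summary above, the same statistics can be reproduced in plain Python on the cleartext values. This is only a sketch using the standard `statistics` module, not crandas code, and the helper name `summarize` is hypothetical. It suggests that the `std` row is the sample standard deviation (dividing by n - 1), since that matches the 1.581139 shown for column `A`.

```python
import statistics

# The same cleartext values used to build the CDataFrame above
a = [1, 2, 3, 4, 5]
b = [-1, -1, -1, -1, -1]


def summarize(values):
    """Return count, mean, sample standard deviation, min and max of a list."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


summary_a = summarize(a)
summary_b = summarize(b)

print(round(summary_a["std"], 6))  # 1.581139
print(summary_b["std"])            # 0.0
```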

## Assigning new columns

Assigning new columns to an existing table is not as simple as in pandas: `df["new_col"] = [5,4,3,2,1]` will not work in crandas. To add a column to a `CDataFrame`, use the `CDataFrame.assign()` method.

```
df = cd.DataFrame({"col1": [1,2,3,4,5], "col2": [6,7,8,9,10]})

# Create a new column called "col3" which is equal to "col1" - 1
df = df.assign(col3=df["col1"] - 1)

# Create two new columns: "col4", which is equal to "col1" + 1, and "col5", which is equal to "col2" - 1
df = df.assign(col4=lambda x: x.col1 + 1, col5=lambda x: x.col2 - 1)
```
```
>>> print(df.open())
   col1  col2  col3  col4  col5
0     1     6     0     2     5
1     2     7     1     3     6
2     3     8     2     4     7
3     4     9     3     5     8
4     5    10     4     6     9
```
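The two calling styles above (passing a computed column directly, or passing a function that receives the table) can be modeled in plain Python. The sketch below is not crandas code: the table is an ordinary dictionary of lists, and `assign_columns` is a hypothetical helper written only to illustrate the idea. It reproduces the values of `col3`, `col4` and `col5` shown in the output above.

```python
def assign_columns(table, **new_columns):
    """Return a copy of `table` (column name -> list) with extra columns.

    Each new column is either a list of values or a callable that
    receives the original table and returns a list of values.
    """
    result = dict(table)
    for name, value in new_columns.items():
        result[name] = value(table) if callable(value) else list(value)
    return result


table = {"col1": [1, 2, 3, 4, 5], "col2": [6, 7, 8, 9, 10]}

# Direct form: compute the values first, then attach them
table = assign_columns(table, col3=[v - 1 for v in table["col1"]])

# Callable form: each function receives the table and returns the new column
table = assign_columns(
    table,
    col4=lambda t: [v + 1 for v in t["col1"]],
    col5=lambda t: [v - 1 for v in t["col2"]],
)

print(table["col3"])  # [0, 1, 2, 3, 4]
print(table["col4"])  # [2, 3, 4, 5, 6]
print(table["col5"])  # [5, 6, 7, 8, 9]
```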

## Computing with columns

Given a column, we can check whether its values are equal to, greater than (or equal to), or less than (or equal to) a given value. The result is a column of 0s and 1s.

```
df = cd.DataFrame({
    "id": [1,2,3,4,5],
    "is_member": [0,1,1,0,1],
    "height": [175,162,151,160,180]
})

# Check which row has id == 2
id_is2 = df["id"] == 2
```

```
>>> print(id_is2.as_table().open())
0  0
1  1
2  0
3  0
4  0
```
```
# Check which rows belong to someone who is 160 cm or taller
at_least160 = df["height"] >= 160
```

```
>>> print(at_least160.as_table().open())
0  1
1  1
2  0
3  1
4  1
```

Of course, we probably want to do more than just create this column. We can also filter on it (find out more about filtering in Selecting data):

```
df = cd.DataFrame({
    "id": [1,2,3,4,5],
    "is_member": [0,1,1,0,1],
    "height": [175,162,151,160,180]
})

# Filter the df such that only members are included (using ==)
member = df[df["is_member"] == 1]

# Print the mean member height
print(member['height'].mean())

# For not_member we can change '==' to '!='
not_member = df[df['is_member'] != 1]

# Print the mean not_member height
print(not_member['height'].mean())
```
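To see which values the two means should come out to, the same filtering can be done by hand on the cleartext lists. This is a plain-Python sketch of the logic, not crandas code, and the exact formatting that crandas prints may differ.

```python
from statistics import mean

is_member = [0, 1, 1, 0, 1]
height = [175, 162, 151, 160, 180]

# Keep the heights of rows whose membership flag is 1 (the == filter)
member_heights = [h for h, m in zip(height, is_member) if m == 1]

# Keep the heights of rows whose membership flag is not 1 (the != filter)
not_member_heights = [h for h, m in zip(height, is_member) if m != 1]

member_mean = mean(member_heights)
not_member_mean = mean(not_member_heights)

print(member_heights, round(member_mean, 2))  # [162, 151, 180] 164.33
print(not_member_heights, not_member_mean)    # [175, 160] 167.5
```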

Note

You can also combine conditions with boolean operations over columns, like `(df['is_member'] == 1) & (df['height'] < 160)`. crandas supports and (`&`), or (`|`), and xor (`^`).
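As a rough illustration of what these three operators compute row by row, the sketch below applies them to 0/1 lists in plain Python. It is not crandas code; it only mirrors the elementwise logic on the example data, combining "is a member" with "is shorter than 160".

```python
is_member = [0, 1, 1, 0, 1]
height = [175, 162, 151, 160, 180]

# Comparisons produce 0/1 columns, as in the outputs shown earlier
member = [int(m == 1) for m in is_member]
short = [int(h < 160) for h in height]

# Elementwise and, or, xor over the two 0/1 columns
both = [a & b for a, b in zip(member, short)]
either = [a | b for a, b in zip(member, short)]
exactly_one = [a ^ b for a, b in zip(member, short)]

print(both)         # [0, 0, 1, 0, 0]
print(either)       # [0, 1, 1, 0, 1]
print(exactly_one)  # [0, 1, 0, 0, 1]
```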

### Conditionals over columns

Once you can check whether a column satisfies a certain property, you might want to create a new column whose value depends on that check. crandas provides the `CSeries.if_else()` method for this. For example, by nesting calls you can turn a numerical column into a categorical one.

```
df = cd.DataFrame({
    "id": [1,2,3,4,5],
    "is_member": [0,1,1,0,1],
    "height": [175,162,151,160,180]
})

# Create a column with the number of cm by which the person's height exceeds 160, or -1 if they are shorter
df = df.assign(cms_above_160=(df["height"] >= 160).if_else(df["height"] - 160, -1))

# Create 3 height categories: 0 for below 160, 1 for 160 up to (but not including) 170, and 2 for 170 and above
df = df.assign(height_cats=(df["height"] >= 170).if_else(2, (df["height"] >= 160).if_else(1,0)))
```
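To make the expected contents of the two new columns concrete, the selection logic can be mimicked on cleartext lists. The `if_else` function below is a hypothetical plain-Python stand-in written for this sketch, not the crandas method; it picks the first value where the condition is 1 and the second value otherwise.

```python
def if_else(condition, if_true, if_false):
    """Elementwise selection over a 0/1 list.

    `if_true` and `if_false` may be lists or single values; single
    values are repeated to match the length of `condition`.
    """
    n = len(condition)
    true_vals = if_true if isinstance(if_true, list) else [if_true] * n
    false_vals = if_false if isinstance(if_false, list) else [if_false] * n
    return [t if c else f for c, t, f in zip(condition, true_vals, false_vals)]


height = [175, 162, 151, 160, 180]
at_least_160 = [int(h >= 160) for h in height]
at_least_170 = [int(h >= 170) for h in height]

# Height above 160, or -1 for anyone shorter than 160
cms_above_160 = if_else(at_least_160, [h - 160 for h in height], -1)

# Nested selection: 2 if >= 170, else 1 if >= 160, else 0
height_cats = if_else(at_least_170, 2, if_else(at_least_160, 1, 0))

print(cms_above_160)  # [15, 2, -1, 0, 20]
print(height_cats)    # [2, 1, 0, 1, 2]
```

Note that a height of exactly 160 falls in category 1 and a height of exactly 170 would fall in category 2, because both comparisons use `>=`.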

Note

crandas provides easier ways to create categorical columns, as seen in Categorical data.

## Manipulating column names

crandas provides methods to rename, add suffixes, or add prefixes to column names.

The `CDataFrame.rename()` method renames specific columns. Pass it a dictionary whose keys are the current column names and whose values are the new column names.

```
df = cd.DataFrame({"col1": [1,2,3], "col2": [4,5,6]})
df = df.rename(columns={"col1": "col11", "col2": "col22"})
>>> print(list(df.columns))
['col11', 'col22']
```

If you would instead like to add a prefix or a suffix to all column names in a `CDataFrame`, use either of the following methods:

```
df = cd.DataFrame({"col1": [1,2,3], "col2": [4,5,6]})