## Criação, descrição e utilização de Keyspace

Em Cassandra, **replicação** é o processo de armazenar múltiplas cópias dos dados em diferentes nós dentro de um cluster. Isso é feito para melhorar a **disponibilidade** e a **tolerância a falhas** do sistema.

O **fator de replicação** (`replication_factor`) determina quantas cópias de cada pedaço de dados serão mantidas no cluster. Por exemplo, se você definir `replication_factor` como 3, cada registro de dados será replicado em três nós diferentes.

A replicação assegura que, se um nó falhar ou ficar indisponível, os dados ainda poderão ser acessados a partir de outros nós que possuem as cópias. Isso é crucial para sistemas distribuídos que exigem alta disponibilidade e desempenho consistente

> cqlsh\> CREATE KEYSPACE cbd WITH replication = {'class': 'SimpleStrategy', 'replication\_factor': '1'}; 

> cqlsh\> USE cbd; 

## Criação e descrição de Tabelas

### CREATE

> **cqlsh\> CREATE TABLE alunos (  
> id\_cliente UUID PRIMARY KEY,  
> nome text,  
> sobrenome text,  
> contatos set<text\>,  
> endereco map<text, text\>,  
> registro\_em timestamp  
> );** 

> cqlsh:cbd\> SELECT \* from alunos; 

id\_cliente | contactos | endereco | nome | registro\_em | sobrenome  
------------+-----------+----------+------+-------------+----------- 

### **INSERT**

> **INSERT INTO alunos (id\_aluno, nome, sobrenome, contactos, endereco, registro\_em) VALUES (uuid(), 'Ana', 'Loureiro', {'[ana.sl@ua.pt](mailto:ana.sl@ua.pt)'}, {'rua': 'Rua Calouste Gulbenkian', 'cidade': 'Aveiro'}, toTimestamp(now()))\*\*\*\*;** 

> **CREATE INDEX ON alunos (nome);** 

### **UPDATE**

> **UPDATE alunos SET contactos = contactos + {'[ana.wk.lsl@gmail.com](mailto:ana.wk.lsl@gmail.com)'} WHERE id\_aluno = 0ae479a0-7716-46b7-9e16-ed0716c2ff88;** 

> cqlsh\> SELECT contactos FROM alunos WHERE id\_aluno = 0ae479a0-7716-46b7-9e16-ed0716c2ff88; 

## contactos

{'[ana.sl@ua.pt](mailto:ana.sl@ua.pt)', '[ana.wk.lsl@gmail.com](mailto:ana.wk.lsl@gmail.com)'}

### DELETE

> **DELETE FROM alunos WHERE id\_aluno = 0ae479a0-7716-46b7-9e16-ed0716c2ff88;** 

### 

### TTL

> **cqlsh:cbd\> INSERT INTO alunos (id\_aluno, nome, sobrenome) VALUES (uuid(), 'Tiago', 'Guimaraes') USING TTL 60;** 

> **cqlsh:cbd\> SELECT TTL(nome) FROM alunos WHERE nome = 'Tiago' AND sobrenome = 'Guimaraes' ALLOW FILTERING;** 

## Native aggregates

### Count

The count function can be used to count the rows returned by a query. Example:

SELECT COUNT (\*) FROM plays;  
SELECT COUNT (1) FROM plays;  
It also can be used to count the non null value of a given column:

SELECT COUNT (scores) FROM plays;

### Max and Min

The max and min functions can be used to compute the maximum and the minimum value returned by a query for a given column. For instance:

SELECT MIN (players), MAX (players) FROM plays WHERE game = 'quake';

### Sum

The sum function can be used to sum up all the values returned by a query for a given column. For instance:

SELECT SUM (players) FROM plays;

### Avg

The avg function can be used to compute the average of all the values returned by a query for a given column. For instance:

SELECT AVG (players) FROM plays;
\*\*Aspect\*\*
\*\*UDF (User-Defined Function)\*\*
\*\*UDA (User-Defined Aggregate)\*\*
\*\*Scope\*\*
Operates on one row or column
Operates across multiple rows
\*\*Purpose\*\*
Transform or compute column values
Aggregate values from multiple rows
\*\*State\*\*
Stateless
Stateful
\*\*Input\*\*
One or more column values (single row)
One column value (multiple rows)
\*\*Output\*\*
Single result for each input
Single aggregated result
\*\*Definition\*\*
\`CREATE FUNCTION\`
\`CREATE FUNCTION\` + \`CREATE AGGREGATE\`

  
### **How the State Function Works in Cassandra**

1. **Input**:
  * The **current state** (e.g., a running total).
  * A **value from the current row** (e.g., attack values).
2. **Processing**:
  * The state function updates the state using the row's value.
  * Returns the updated state.
3. **Final Output**:
  * Once all rows are processed, the final state becomes the UDA's result.

  
  
#### Steps:

1. **Initial State**: `0`.
2. **Row 1 (Togekiss)**:
  * Attack power: `75 + 80 + 90 = 245`.
  * New state: `0 + 245 = 245`.
3. **Row 2 (Torterra)**:
  * Attack power: `100 + 120 = 220`.
  * New state: `245 + 220 = 465`.

#### Final Result:

The UDA returns `465`.

---

### **Conclusion**

A **state function** is essential for a UDA because:

1. It accumulates intermediate results across rows.
2. It defines the logic to process each row and update the state.
3. It enables Cassandra to compute aggregate values efficiently without materializing the entire result set.

  
PreparedStatement ps = session.prepare("SELECT attacks FROM pokemons WHERE pokemon\_id = ?;"); BoundStatement bs = ps.bind(pokemonId); ResultSet pokemonResults = session.execute(bs);  

  
**CQL Syntax Requirement**: Unlike some SQL dialects, Cassandra's CQL requires a `FROM` clause in every `SELECT` statement. You cannot execute a `SELECT` without specifying a table.  

  
One way to satisfy the `FROM` requirement is to select from a system table or a dummy table. The `system.local` table is present in every Cassandra keyspace and contains only one row.

**Modified Query**:
    
    sqlCopiar código`SELECT pokemon_total_attack(?) AS pokemon_attack_power FROM system.local LIMIT 1`