Week 5
: Data Structures
Lecture Contents
- Data structures are forms of organization in memory.
1. Stacks and Queues
- Queues : one form of abstract data structure (enqueued, dequeued)
- Stacks (push, pop)
FIFO | First in first out |
LIFO | Last in first out |
typedef struct
{
person people[CAPACITY];
int size;
}
stack;
- Limitation of the code above : unable to grow as items are added to it.
2. Resizing Arrays
There's an array :
In memory, there are other values being stored by other programs, functions, and variables.
If you want to store a fourth value '4' in our array.
You can allocate a new area of memory of size 4 and move the old array to a new one.
However, it's bad designed - Every time we add a number, we have to copy the array item by item.
It would be nice if we were able to put the '4' somewhere else in memory. (this would no longer be an array though because it's not contiguous)
list.c
#include <stdio.h>
int main(void)
{
//List of size 3
int list[3];
//Initialize list with numbers
list[0] = 1;
list[1] = 2;
list[2] = 3;
//Print list
for (int i = 0; i < 3; i++)
{
printf("%i\n", list[i]);
}
}
We can leverage our understanding of pointers to create a better design in this code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
//List of size 3
int *list = malloc(3 * sizeof(int));
if (list == NULL)
{
return 1;
}
//Initialize list of size 3 with numbers
list[0] = 1;
list[1] = 2;
list[2] = 3;
//List of size 4
int *tmp = malloc(4 * sizeof(int));
if (tmp == NULL)
{
free(list);
return 1;
}
//Copy list of size 3 into list of size 4
for (int i = 0; i < 3; i++)
{
tmp[i] = list[i];
}
//Add number to list of size 4
tmp[3] = 4;
//Free list of size 3
free(list);
//Remember list of size 4
list = tmp;
//Print list
for (int i = 0; i < 4; i++)
{
printf("%i\n", list[i]);
}
//Free list
free(list);
return 0;
}
- Notice that the 'tmp' list was created.
- free(list) : since the block of memory that list points to is no longer used.
3. Linked Lists
- three useful primitives
struct | a data type that you can define yourself |
. | dot notation allows you to access variables inside that structure |
* | is used to declare a pointer or dereference a variable |
- -> operator : goes to an address and looks inside of a structure
A linked list is one of the most powerful data structures within C. It allows you to include values that are located at varying areas of memory. They allow you to dynamically grow and shrink the list as you desire.
Three values stored at three different areas of memory :
We could utilize more memory to keep track of where the next item is :
We would keep one more element in memory, a pointer, that keeps track of the first item in the list.
These boxes are called nodes.
A node contains both an item (int number) & a pointer called next.
typedef struct node
{
int number;
struct node *next;
}
node;
Let's implement this picture above in code:
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
typedef struct node
{
int number;
struct node *next;
}
node;
int main(int argc, string agrv[])
{
//Memory for numbers
node *list = NULL;
//For each command-line argument
for (int i = 1; i < agrc; i++)
{
//Convert argument to int
int number = atoi(argv[i]);
//Allocate node for number
node *n = malloc(sizeof(node));
if (n == NULL)
{
return 1;
}
n->number = number;
n->next = NULL;
//Prepend node to list
n->next = list;
list = n;
}
//Print numbers
node *ptr = list;
while (ptr != NULL)
{
printf("%i\n", ptr->number);
ptr = ptr->next;
}
//Free memory
ptr = list;
while (ptr != NULL)
{
node *next = ptr->next;
free(ptr);
ptr = next;
}
}
- Inserting into the list : O(1)
- The amount of time required to search this list : O(n)
Pros
-Since linked lists are not stored in a contiguous block of memory, they can grow as large as you wish.
Cons
-More memory is required to keep track of the list instead of an array. (Pointer)
-Binary search is not possible in a list constructed as above.
★ 헷갈렸던 NULL 개념 정리
type something == NULL;
→ something이 아무것도 pointing 하고있지 않음을 의미함.
You can sort your list as items are added :
#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
typedef struct node
{
int number;
struct node *next;
}
node;
int main(int argc, char *argv[])
{
node *list = NULL;
for(int i = 1; i < agrc; i++)
{
int number = atoi(agrv[i]);
node *n = malloc(sizeof(node));
if(n == NULL)
{
return 1;
}
n->number = number;
n->next = NULL;
if(list == NULL)
{
list = n;
}
else if(n->number < list->number)
{
n->next = list;
list = n;
}
else
{
for (node *ptr = list; ptr != NULL; ptr = ptr->next)
{
if(ptr->next == NULL)
{
ptr->next = n;
break;
}
if (n->number < ptr->next->number)
{
n->next = ptr->next;
ptr->next = n;
break;
}
}
}
for (node *ptr = list; ptr != NULL; ptr = ptr->next)
{
printf("%i\n", ptr->number);
}
node *ptr = list;
while (ptr != NULL)
{
node *next = ptr->next;
free(ptr);
ptr = next;
}
}
4. Trees
- Binary search trees : another data structure to store data more efficiently
- The center value becomes the top of a tree.
- Those that are less than the center value are placed to the left.
- Those values that are more than the center value are to the right.
-Pointers can be used to point to the correct location of each area of memory.
In code :
-Search time : O(logn)
5. Dictionaries
- Dictionaries : another data structure
- They have a key and a value. (like actual book-form dictionaries that have a word and a definition)
- The holy grail of algorithmic time complexity = O(1) or constant time
- Dictionaries can offer this speed of access through hashing.
6. Hashing and hash Tables
- Hashing : the idea of taking a value and being able to output a value that becomes a shortcut to it later.
- A hash function : an algorithm that reduces a larger value to something small and predictable.
- A hash table : a combination of both array and linked lists. An array of pointers to nodes.
A hash table could be imagined as follows:
- An array of alphabet
-When collisions happen, it can be reduced by improving your hash table and hash algorithm.
However, it takes up massive amount of memory.
A hash algorithm:
In code:
#include <ctype.h>
unsigned int hash(const char *word)
{
return toupper(word[0]) - 'A';
}
7. Tries
- Tries : another form of data structure
- always searchable in constant time !
- Downside : take up a large amount of memory
For example, a word 'Toad' would be stored as follows:
Toadette and Tom would be stored as:
5주차에서는 Queues, Linked lists, Trees, Dictionaries, 그리고 Tries 까지 다양한 data structure의 개념과 활용 방법에 대해서 알아보았다. 컴퓨터의 메모리를 정리할 때 어떤 data structure을 사용하느냐에 따라 속도와 메모리가 차지하는 크기가 달라지기 때문에 중요한 영향을 끼친다. 저번주에 이어서 보이지 않는 메모리 영역을 다루고, pointer의 개념이 아직 미숙해서 처음으로 강의를 다시 뒤로 돌려 반복해서 듣기도 한 내용들이다. 아직까지 완벽하게 이해했다고 자신할 수는 없지만, Section과 Pset5를 듣고 고민하는 과정에서 더 큰 이해를 할 수 있길!! 오늘도 수고 많았다 안뇨옹
'CS50' 카테고리의 다른 글
[CS50] Pset 4 : Volume, Filter-less, Recover (5) | 2024.10.22 |
---|---|
[CS50] Week 4 Section (1) | 2024.10.10 |
[CS50] Week 4 Review (4) | 2024.10.09 |
[CS50] Pset 3 : Sort, Plurality, Runoff (1) | 2024.10.06 |
[CS50] Week 3 Section (2) | 2024.10.03 |