Jupyter Notebook is an open-source web application for writing and sharing live code, equations, and visualizations alongside rich text elements. It provides a convenient way to combine paragraphs, equations, titles, links, and figures with the code that runs a data analysis. It is also useful for sharing interactive algorithms with an audience for teaching or demonstration purposes.
The Jupyter Notebook App is the interface where you write your scripts and code through your web browser. The app can run locally, in which case you don't need internet access, or on a remote server.
Each computation is done via a kernel. A new kernel is created each time you launch a Jupyter Notebook.
In the session below, you will learn how to use Jupyter Notebook. You will write a simple line of code to get familiar with the environment of Jupyter.
Step 1) Add a folder inside the working directory that will contain all the notebooks you create during the tutorials about TensorFlow.
Open the Terminal and write
mkdir jupyter_tf
jupyter notebook
Code Explanation
mkdir jupyter_tf: creates a folder named jupyter_tf
jupyter notebook: opens the Jupyter web app
Step 2) You can see the new folder inside the environment. Click on the folder jupyter_tf.
Step 3) Inside this folder, you will create your first notebook. Click on the button New and Python 3.
Step 4) You are inside the Jupyter environment. So far, your notebook is called Untitled.ipynb. This is the default name given by Jupyter. Let's rename it by clicking on File and Rename
You can rename it Introduction_jupyter
In Jupyter Notebook, you write code, annotations, or text inside the cells.
Inside a cell, you can write a single line of code or multiple lines. Jupyter reads the code one line after another.
For instance, if you write the following code inside a cell.
It will produce this output.
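As a hypothetical illustration (the variable names below are not from the original tutorial), a cell with several lines might look like the following sketch. Jupyter runs the statements from top to bottom, so each line can use what the previous lines defined.

```python
# A hypothetical multi-line cell: statements run in order, top to bottom.
width = 3
height = 4
area = width * height  # uses the two values defined above
print("Area:", area)   # prints: Area: 12
```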
Step 5) You are ready to write your first line of code. Notice that the cell can have two colors. The green color means you are in editing mode.
The blue color, however, indicates you are in command mode.
Your first line of code will print gtupapers!. Inside the cell, you can write
print("gtupapers!")
There are two ways to run code in Jupyter:
To run the code, you can click on Cell and then Run Cells and Select Below
You can see the output is printed below the cell and a new cell has appeared right after the output.
A faster way to run a code is to use the Keyboard Shortcuts. To access the Keyboard Shortcuts, go to Help and Keyboard Shortcuts
Below is the list of shortcuts for a MacOS keyboard. You can edit the shortcuts in the editor.
The following are the shortcuts for Windows
Write this line
print("Hello world!")
and try to use the Keyboard Shortcuts to run the code. Use Alt+Enter. It will execute the cell and insert a new empty cell below, like you did before.
Step 6) It is time to close the Notebook. Go to File and click on Close and Halt
Note: Jupyter automatically saves the notebook with a checkpoint. If you see the following message:
It means Jupyter hasn't saved the file since the last checkpoint. You can manually save the notebook
You will be redirected to the main panel. You can see your notebook was saved a minute ago. You can safely log out.
Below is a step by step process on how to install and run Jupyter Notebook on AWS:
If you do not have an account at AWS, create a free account here.
We will proceed as follows
Step 1) Go to Services and find EC2
Step 2) In the panel, click on Key Pairs
Step 3) Click Create Key Pair
A file named Docker_key.pem downloads.
Step 4) Copy and paste it into the folder key. We will need it soon.
For Mac OS users only
This step concerns only Mac OS users. Windows or Linux users can proceed to PART 2
You need to set a working directory that will contain the key file
First of all, create a folder named key. For us, it is located inside the main folder Docker. Then, set this path as your working directory
mkdir Docker/key
cd Docker/key
Step 1) You need to configure a security group. You can access it with the panel
Step 2) Click on Create Security Group
Step 3) In the next Screen
Step 4) The newly created Security Group will be listed
Part 3: Launch instance
You are finally ready to create the instance
Step 1) Click on Launch Instance
The default server is enough for your needs. You can choose Amazon Linux AMI. The current version is 2018.03.0.
AMI stands for Amazon Machine Image. It contains the information required to start an instance that runs on a virtual server in the cloud.
Note that AWS has servers dedicated to deep learning, such as:
All of them come with the latest binaries of deep learning frameworks pre-installed in separate virtual environments:
Fully configured with NVidia CUDA, cuDNN and NCCL as well as Intel MKL-DNN
Step 2) Choose t2.micro. It is a free tier server. AWS offers this virtual machine, equipped with 1 vCPU and 1 GB of memory, for free. This server provides a good tradeoff between computation, memory and network performance. It is suited to small and medium databases
Step 3) Keep settings default in next screen and click Next: Add Storage
Step 4) Increase storage to 10GB and click Next
Step 5) Keep settings default and click Next: Configure Security Group
Step 6) Choose the security group you created before, which is jupyter_docker
Step 7) Review your settings and Click the launch button
Step 8) The last step is to link the key pair to the instance.
Step 9) The instance will launch
Step 10) Below is a summary of the instances currently in use. Note the public IP
Step 11) Click on Connect
You will find the connection details
Launch your instance (Mac OS users)
First, make sure that inside the terminal, your working directory points to the folder with the key pair file docker.pem
Run the command
chmod 400 docker.pem
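To make the meaning of mode 400 concrete, here is a minimal Python sketch of the same permission change, assuming a Linux or macOS system. The function name make_owner_read_only is hypothetical, and the sketch works on a throwaway temporary file rather than a real key. Mode 400 means the owner may read the file and nobody else has any access, which is what ssh expects of a private key.

```python
import os
import stat
import tempfile


def make_owner_read_only(path):
    """Equivalent of `chmod 400 path`: owner may read, nobody else has access."""
    os.chmod(path, stat.S_IRUSR)  # stat.S_IRUSR is the octal value 0o400
    # Return only the permission bits of the file mode.
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrate on a throwaway file rather than a real key (hypothetical stand-in).
with tempfile.NamedTemporaryFile(suffix=".pem", delete=False) as tmp:
    demo_key = tmp.name

mode = make_owner_read_only(demo_key)
print(oct(mode))  # 0o400 on Linux and macOS
os.remove(demo_key)
```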
Open the connection with one of these commands.
There are two commands. In some cases, the first command prevents Jupyter from opening the notebook.
In this case, use the second one to force the connection to Jupyter Notebook on EC2.
# Replace PUBLIC_DNS with the public DNS of your instance
# If able to launch Jupyter
ssh -i "docker.pem" ec2-user@PUBLIC_DNS
# If not able to launch Jupyter
ssh -i "docker.pem" ec2-user@PUBLIC_DNS -L 8888:127.0.0.1:8888
The first time, you are prompted to accept the connection
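The difference between the two ssh commands is the -L option, which forwards a port on your machine to the same port on the instance. The sketch below assembles both variants in Python so the pieces are easier to see. The function build_ssh_command and the host ec2-xx.example.com are hypothetical placeholders, not part of the original tutorial.

```python
import shlex


def build_ssh_command(key_file, user, host, forward_port=None):
    """Return the ssh command as a list of arguments."""
    command = ["ssh", "-i", key_file, f"{user}@{host}"]
    if forward_port is not None:
        # -L local_port:127.0.0.1:remote_port forwards a local port
        # to the same port on the instance.
        command += ["-L", f"{forward_port}:127.0.0.1:{forward_port}"]
    return command


# "ec2-xx.example.com" is a placeholder host, not a real instance.
plain = build_ssh_command("docker.pem", "ec2-user", "ec2-xx.example.com")
tunnel = build_ssh_command("docker.pem", "ec2-user", "ec2-xx.example.com", forward_port=8888)
print(shlex.join(plain))
print(shlex.join(tunnel))
```

With the tunnel in place, a notebook served on port 8888 of the instance can be reached at 127.0.0.1:8888 on your own machine.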
Step 1) Go to this website to download PuTTY and PuTTYgen
You need to download both programs.
Now that both programs are installed, you need to convert the .pem file to .ppk. PuTTY can only read .ppk. The .pem file contains the unique key created by AWS.
Step 2) Open PuTTYgen and click on Load. Browse the folder where the .pem file is located.
Step 3) After you load the file, you should get a notice informing you that the key has been successfully imported. Click on OK
Step 4) Then click on Save private key. You are asked if you want to save this key without a passphrase. Click on Yes.
Step 5) Save the Key
Step 6) Go to AWS and copy the public DNS
Open PuTTY and paste the Public DNS in the Host Name
Step 7)
Step 8)
When this step is done, a new window will be opened. Click Yes if you see this pop-up
Step 9)
You need to login as: ec2-user
Step 10)
You are connected to the Amazon Linux AMI.
While you are connected to the server via PuTTY/Terminal, you can install Docker.
Execute the following commands
sudo yum update -y
sudo yum install -y docker
sudo service docker start
sudo usermod -a -G docker ec2-user
exit
Launch the connection again (replace PUBLIC_DNS with the public DNS of your instance)
ssh -i "docker.pem" ec2-user@PUBLIC_DNS -L 8888:127.0.0.1:8888
Windows users use SSH as mentioned above
Step 1) Create Jupyter with a pre-built image.
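## Tensorflow
docker run -v ~/work:/home/jovyan/work -d -p 8888:8888 jupyter/tensorflow-notebook
## Spark
docker run -v ~/work:/home/jovyan/work -d -p 8888:8888 jupyter/pyspark-notebook
Code Explanation
docker run: run the image
-v: attach a volume
~/work:/home/jovyan/work: mount the ~/work folder of the instance at /home/jovyan/work inside the container
-d: run the container in the background (detached mode)
-p 8888:8888: publish port 8888 of the container on port 8888 of the instance
jupyter/tensorflow-notebook, jupyter/pyspark-notebook: the image to run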
For other pre-built images, go here
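Since the two docker run commands differ only in the image name, the command can be composed from a few parameters. The following is a minimal sketch, not an official tool: the function docker_run_command and the IMAGES table are hypothetical helpers, and only the two image names used in this tutorial are listed.

```python
# Image names taken from the two commands in this tutorial.
IMAGES = {
    "tensorflow": "jupyter/tensorflow-notebook",
    "spark": "jupyter/pyspark-notebook",
}


def docker_run_command(flavor, host_dir="~/work", port=8888):
    """Build the docker run command for one of the pre-built images."""
    return [
        "docker", "run",
        "-v", f"{host_dir}:/home/jovyan/work",  # mount the host folder into the container
        "-d",                                   # detached: run in the background
        "-p", f"{port}:{port}",                 # publish the container port on the host
        IMAGES[flavor],
    ]


print(" ".join(docker_run_command("tensorflow")))
print(" ".join(docker_run_command("spark")))
```

If the list is handed to subprocess.run without a shell, the ~ is not expanded automatically, so os.path.expanduser would be needed for the host folder.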
Allow Jupyter to save notebooks in the work folder on the AWS instance
sudo chown 1000 ~/work
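The chown command hands the work folder to user id 1000, which suggests that the notebook user inside the image runs with that id (an assumption based on the command, not something stated in this tutorial). A small Python sketch of how folder ownership can be checked on Linux or macOS is shown below. The helper is_owned_by is hypothetical, and the demonstration uses a temporary folder instead of ~/work.

```python
import os
import tempfile


def is_owned_by(path, uid):
    """Return True when the file or folder at `path` belongs to user id `uid`."""
    return os.stat(path).st_uid == uid


# Demonstrate on a temporary folder, which belongs to the current user.
with tempfile.TemporaryDirectory() as demo_dir:
    owned = is_owned_by(demo_dir, os.getuid())
    print(owned)  # True
```

On the instance, the equivalent check would be is_owned_by(os.path.expanduser("~/work"), 1000).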
Step 2) Install tree to see our working directory next
sudo yum install -y tree
Step 3) Check the container and its name
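Use docker ps to list the running containers, then pass the container name (vigilant_easley in this example) to docker logs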
docker ps
docker logs vigilant_easley
Step 4) In the URL, replace (90a3c09282d6 or 127.0.0.1) with the Public DNS of your instance
http://(90a3c09282d6 or 127.0.0.1):8888/?token=f460f1e79ab74c382b19f90fe3fd55f9f99c5222365eceed
Step 5) The new URL becomes
http://ec2-174-129-135-16.compute-1.amazonaws.com:8888/?token=f460f1e79ab74c382b19f90fe3fd55f9f99c5222365eceed
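The host replacement can also be sketched in a few lines of Python. This is only an illustration under the assumption that the token is a run of hexadecimal characters, as in the URL shown in this tutorial. The function public_notebook_url is a hypothetical helper.

```python
import re


def public_notebook_url(log_url, public_dns, port=8888):
    """Swap the host part of the URL printed by `docker logs` for the public DNS."""
    # Assumes the token consists of lowercase hexadecimal characters.
    match = re.search(r"token=([0-9a-f]+)", log_url)
    if match is None:
        raise ValueError("no token found in URL")
    return f"http://{public_dns}:{port}/?token={match.group(1)}"


log_url = "http://(90a3c09282d6 or 127.0.0.1):8888/?token=f460f1e79ab74c382b19f90fe3fd55f9f99c5222365eceed"
new_url = public_notebook_url(log_url, "ec2-174-129-135-16.compute-1.amazonaws.com")
print(new_url)
```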
Step 6) Copy and paste the URL into your browser.
Jupyter Opens
Step 7) You can write a new Notebook in your work folder
Close the connection in the terminal
exit
Go back to AWS and stop the server.
If Docker ever does not work, try to rebuild the image using
docker run -v ~/work:/home/jovyan/work -d -p 8888:8888 jupyter/tensorflow-notebook