# Setup - Run only once per Kernel App
%conda install openjdk -y
# install PySpark
%pip install pyspark==3.4.0
# install spark-nlp
%pip install spark-nlp==5.1.3
%pip install sparknlp
# install plotly
%pip install plotly
# install yfinance for external data
%pip install yfinance
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 23.3.1
latest version: 23.10.0
Please update conda by running
$ conda update -n base -c defaults conda
Or to minimize the number of packages updated during conda update use
conda install conda=23.10.0
## Package Plan ##
environment location: /opt/conda
added / updated specs:
- openjdk
The following packages will be downloaded:
package | build
---------------------------|-----------------
ca-certificates-2023.08.22 | h06a4308_0 123 KB
certifi-2023.7.22 | py310h06a4308_0 153 KB
openjdk-11.0.13 | h87a67e3_0 341.0 MB
------------------------------------------------------------
Total: 341.3 MB
The following NEW packages will be INSTALLED:
openjdk pkgs/main/linux-64::openjdk-11.0.13-h87a67e3_0
The following packages will be UPDATED:
ca-certificates conda-forge::ca-certificates-2023.7.2~ --> pkgs/main::ca-certificates-2023.08.22-h06a4308_0
The following packages will be SUPERSEDED by a higher-priority channel:
certifi conda-forge/noarch::certifi-2023.7.22~ --> pkgs/main/linux-64::certifi-2023.7.22-py310h06a4308_0
Downloading and Extracting Packages
certifi-2023.7.22 | 153 KB | | 0%
openjdk-11.0.13 | 341.0 MB | | 0%
ca-certificates-2023 | 123 KB | | 0%
certifi-2023.7.22 | 153 KB | ##################################### | 100%
openjdk-11.0.13 | 341.0 MB | 4 | 1%
openjdk-11.0.13 | 341.0 MB | ##1 | 6%
openjdk-11.0.13 | 341.0 MB | ###3 | 9%
openjdk-11.0.13 | 341.0 MB | #####1 | 14%
openjdk-11.0.13 | 341.0 MB | ####### | 19%
openjdk-11.0.13 | 341.0 MB | ########9 | 24%
openjdk-11.0.13 | 341.0 MB | ##########6 | 29%
openjdk-11.0.13 | 341.0 MB | ############5 | 34%
openjdk-11.0.13 | 341.0 MB | ##############2 | 38%
openjdk-11.0.13 | 341.0 MB | ###############8 | 43%
openjdk-11.0.13 | 341.0 MB | #################6 | 48%
openjdk-11.0.13 | 341.0 MB | ###################4 | 52%
openjdk-11.0.13 | 341.0 MB | #####################2 | 57%
openjdk-11.0.13 | 341.0 MB | ####################### | 62%
openjdk-11.0.13 | 341.0 MB | ########################9 | 68%
openjdk-11.0.13 | 341.0 MB | ##########################8 | 73%
openjdk-11.0.13 | 341.0 MB | ############################6 | 77%
openjdk-11.0.13 | 341.0 MB | ##############################4 | 82%
openjdk-11.0.13 | 341.0 MB | ################################3 | 87%
openjdk-11.0.13 | 341.0 MB | ##################################1 | 92%
openjdk-11.0.13 | 341.0 MB | ###################################9 | 97%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Note: you may need to restart the kernel to use updated packages.
Collecting pyspark==3.4.0
Using cached pyspark-3.4.0-py2.py3-none-any.whl
Collecting py4j==0.10.9.7 (from pyspark==3.4.0)
Using cached py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
Installing collected packages: py4j, pyspark
Successfully installed py4j-0.10.9.7 pyspark-3.4.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
Collecting spark-nlp==5.1.3
Obtaining dependency information for spark-nlp==5.1.3 from https://files.pythonhosted.org/packages/cd/7d/bc0eca4c9ec4c9c1d9b28c42c2f07942af70980a7d912d0aceebf8db32dd/spark_nlp-5.1.3-py2.py3-none-any.whl.metadata
Using cached spark_nlp-5.1.3-py2.py3-none-any.whl.metadata (53 kB)
Using cached spark_nlp-5.1.3-py2.py3-none-any.whl (537 kB)
Installing collected packages: spark-nlp
Successfully installed spark-nlp-5.1.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
Collecting sparknlp
Using cached sparknlp-1.0.0-py3-none-any.whl (1.4 kB)
Requirement already satisfied: spark-nlp in /opt/conda/lib/python3.10/site-packages (from sparknlp) (5.1.3)
Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (from sparknlp) (1.26.0)
Installing collected packages: sparknlp
Successfully installed sparknlp-1.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
Requirement already satisfied: plotly in /opt/conda/lib/python3.10/site-packages (5.9.0)
Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/lib/python3.10/site-packages (from plotly) (8.0.1)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
Collecting yfinance
Obtaining dependency information for yfinance from https://files.pythonhosted.org/packages/1c/19/bf19123baf16a55fd38cbb100b5a49380b9b6db7279987034689d11254c7/yfinance-0.2.32-py2.py3-none-any.whl.metadata
Using cached yfinance-0.2.32-py2.py3-none-any.whl.metadata (11 kB)
Requirement already satisfied: pandas>=1.3.0 in /opt/conda/lib/python3.10/site-packages (from yfinance) (1.4.4)
Requirement already satisfied: numpy>=1.16.5 in /opt/conda/lib/python3.10/site-packages (from yfinance) (1.26.0)
Requirement already satisfied: requests>=2.31 in /opt/conda/lib/python3.10/site-packages (from yfinance) (2.31.0)
Collecting multitasking>=0.0.7 (from yfinance)
Using cached multitasking-0.0.11-py3-none-any.whl (8.5 kB)
Requirement already satisfied: lxml>=4.9.1 in /opt/conda/lib/python3.10/site-packages (from yfinance) (4.9.3)
Requirement already satisfied: appdirs>=1.4.4 in /opt/conda/lib/python3.10/site-packages (from yfinance) (1.4.4)
Collecting pytz>=2022.5 (from yfinance)
Obtaining dependency information for pytz>=2022.5 from https://files.pythonhosted.org/packages/32/4d/aaf7eff5deb402fd9a24a1449a8119f00d74ae9c2efa79f8ef9994261fc2/pytz-2023.3.post1-py2.py3-none-any.whl.metadata
Using cached pytz-2023.3.post1-py2.py3-none-any.whl.metadata (22 kB)
Collecting frozendict>=2.3.4 (from yfinance)
Using cached frozendict-2.3.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (115 kB)
Collecting peewee>=3.16.2 (from yfinance)
Using cached peewee-3.17.0-cp310-cp310-linux_x86_64.whl
Requirement already satisfied: beautifulsoup4>=4.11.1 in /opt/conda/lib/python3.10/site-packages (from yfinance) (4.11.1)
Collecting html5lib>=1.1 (from yfinance)
Using cached html5lib-1.1-py2.py3-none-any.whl (112 kB)
Requirement already satisfied: soupsieve>1.2 in /opt/conda/lib/python3.10/site-packages (from beautifulsoup4>=4.11.1->yfinance) (2.3.1)
Requirement already satisfied: six>=1.9 in /opt/conda/lib/python3.10/site-packages (from html5lib>=1.1->yfinance) (1.16.0)
Requirement already satisfied: webencodings in /opt/conda/lib/python3.10/site-packages (from html5lib>=1.1->yfinance) (0.5.1)
Requirement already satisfied: python-dateutil>=2.8.1 in /opt/conda/lib/python3.10/site-packages (from pandas>=1.3.0->yfinance) (2.8.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.10/site-packages (from requests>=2.31->yfinance) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.10/site-packages (from requests>=2.31->yfinance) (3.3)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.10/site-packages (from requests>=2.31->yfinance) (2.0.6)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests>=2.31->yfinance) (2023.7.22)
Using cached yfinance-0.2.32-py2.py3-none-any.whl (68 kB)
Using cached pytz-2023.3.post1-py2.py3-none-any.whl (502 kB)
Installing collected packages: pytz, peewee, multitasking, html5lib, frozendict, yfinance
Attempting uninstall: pytz
Found existing installation: pytz 2022.1
Uninstalling pytz-2022.1:
Successfully uninstalled pytz-2022.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spyder 5.3.3 requires pyqt5<5.16, which is not installed.
spyder 5.3.3 requires pyqtwebengine<5.16, which is not installed.
jupyterlab-server 2.10.3 requires jupyter-server~=1.4, but you have jupyter-server 2.7.3 which is incompatible.
sagemaker-datawrangler 0.4.3 requires sagemaker-data-insights==0.4.0, but you have sagemaker-data-insights 0.3.3 which is incompatible.
spyder 5.3.3 requires ipython<8.0.0,>=7.31.1, but you have ipython 8.16.1 which is incompatible.
spyder 5.3.3 requires pylint<3.0,>=2.5.0, but you have pylint 3.0.1 which is incompatible.
Successfully installed frozendict-2.3.8 html5lib-1.1 multitasking-0.0.11 peewee-3.17.0 pytz-2023.3.post1 yfinance-0.2.32
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.