software engineer, consultant, conference speaker, #tech4good, #stacktivism

this week - 040

-

A lot of Snowflake this week! Let's get into #dataengineering

Introducing Snowflake's Workspaces: the new IDE is not my favorite YET. Why I'm still #TeamVSCode for my data engineering

I know matching any IDE to VS Code is going to be a tall order. I'm also biased: I used to do Developer and Community engagement with my favorite team on the planet. There are a few quirks that make me excited about the Workspaces GA, but it still leaves much to be desired.

[Screenshot: Picking a Snowflake Notebook]

Cons:

  • I miss my REPL! Without a terminal, I'm not able to iterate through testing the API. I also love the shift+enter functionality that's intrinsic to Jupyter, and common in IDEs these days, for running snippets of a file. If I'm building iteratively for an enterprise pipeline, I'm trying to make my code flexible and dynamic and check values as I'm writing.
  • The syntax highlighting leaves much to be desired.
  • Defining the entry point function and not having to call it in your file is odd, but I think it's a nice bit of syntactic sugar / feature for non-programmers.
  • No GitHub integration, no terminal, no programmatic version control.
  • I miss having ctrl+s save my work.
  • I know Anaconda has a legacy of slow builds, but once your environment is built you're not worried about speed. Each run is entirely too slow, and there's no way to know if it's rebuilding the environment each time or if that is cached. More speed and a stack trace would be ideal.

Pros:

  • Snowflake's Copilot from Microsoft is in here! Yay! I'm still using Claude Code as my favorite tool, but if this is the same Copilot (+agent) as in VS Code then I'm likely to switch.
  • The starter project for the Python API does give recommendations, but it's all in one main() function.
  • Since we're not building with the conda environment exposed, you cannot add version control. That would be a definite con if the team hadn't added the versions button:
[Screenshot: Snowflake Workspace Versions Window]
  • It has Anaconda in it! That makes it easy to keep playing with these data engineering scenarios and fit them into my work time. I was concerned when I heard that it supports a custom distribution of Anaconda packages, but if you want more packages you have to go to PyPI.
[Screenshot: Picking Packages in Snowflake Workspace]

For those not familiar, switching between the pip and conda package managers, with two different package repositories, is not best practice. You'll end up with some inscrutable issue and have to nuke your environment. I love PyPI, but this seems to get back to Snowflake's strategic decision of not having a command line terminal in their IDE, likely because there are too many accidents you can slip into by using pip, then conda, then uv, then pixi, then back to pip. The solution is actually pretty elegant even though it forces you into a UI.

[Screenshot: conda-pypi documentation]
Keep an eye out for conda-pypi, a project in the conda incubator that's getting more activity these days. 👀 I *love* the conda ecosystem. There's been lots of great movement in the tools that consume wheel-format packages (the default), and conda ecosystem contributors are excited to put more Rust in our toolchain. Their openness to collaborating on package artifacts and formats, and to sharing lessons learned with the rest of the ecosystem, is admirable. We all get a better Python developer experience for it.

An exciting opportunity with the IDE is getting autocomplete and help text for Snowflake's very confusing API. For example: should I be using root.database.schema.tables.create(), or snowpark.session.sql() and passing in f-strings? My current recommendation is that the latter is more flexible, although syntax highlighting in f-strings is not easy and I'm more prone to mistakes that way. I would love for Snowflake to suggest best practice.
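
To make the f-string route a little less mistake-prone, here is a minimal sketch that checks identifiers before interpolating them into a CREATE TABLE statement. The helper names (check_ident, create_table_ddl) and the ANALYTICS.RAW.EVENTS table are hypothetical, and the commented-out session.sql() call assumes you already have a Snowpark session. Column types are passed through unchecked, so treat this as a starting point rather than a safe SQL builder.

```python
import re

# Unquoted identifiers: a letter or underscore, then letters, digits,
# underscores, or dollar signs. Quoted identifiers are out of scope here.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def check_ident(name: str) -> str:
    """Return name unchanged if it is a plain unquoted identifier, else raise."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def create_table_ddl(database: str, schema: str, table: str, columns: dict[str, str]) -> str:
    """Build a CREATE TABLE statement from checked identifiers."""
    fq_name = ".".join(check_ident(part) for part in (database, schema, table))
    cols = ", ".join(f"{check_ident(col)} {dtype}" for col, dtype in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {fq_name} ({cols})"


ddl = create_table_ddl("ANALYTICS", "RAW", "EVENTS", {"ID": "NUMBER", "PAYLOAD": "VARIANT"})
print(ddl)

# With a live Snowpark session (assumed, not created in this sketch):
# session.sql(ddl).collect()
```

Building the string in one small function means I can print and eyeball the statement before it ever reaches Snowflake, which takes some of the sting out of having no syntax highlighting inside f-strings.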

Ultimately it feels like the beginning of a great feature. But with the IDE market being so well supported, Python developers are spoiled for choice. I can tell you from the many IDE conversations we're having at Anaconda that it takes a lot of iteration and user feedback to support an IDE for the long term. I will always have a special place in my heart for the VS Code team because I saw their PMs and engineers put iterative improvement, product feedback, testing, experimentation, sensible defaults, and UI + file-based settings at the core of their product.

I'm sure the real value add is the one-click deploy button, but I'm hit with so many points of friction before I can click "deploy" that I felt compelled to write about it. I will update once I get there.

Creating Snowflake tables programmatically with DDL (Data Definition Language)

With Snowflake, whether you're using their Workspaces, Streamlit, a Snowflake notebook or your non-Snowflake IDE of choice, you may find yourself wanting to create, manipulate and modify objects. In Snowflake everything is an object: users and roles (account objects / "securable objects"), virtual warehouses (technically compute resources / "securable objects"), databases, user-defined functions (UDFs), external functions, schemas, tables, views, columns, functions and even stored procedures.

Securable objects are the objects in the security model that privileges can be granted on. Account-level objects such as users, roles and warehouses sit outside the schema hierarchy that holds the database objects.

Account
├── Databases
│   └── Schemas
│       └── Tables/Views/Functions/etc.
├── Warehouses (compute)
└── Users/Roles (security)
Snowflake's object hierarchy

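
As a rough illustration of that containment, here is a small sketch that records each object kind's parent and walks up to the account. The PARENT mapping only covers the kinds shown in the tree, and containment_path is a hypothetical helper, not part of any Snowflake library.

```python
# Parent container for each object kind, taken from the tree above.
PARENT = {
    "database": "account",
    "warehouse": "account",
    "user": "account",
    "role": "account",
    "schema": "database",
    "table": "schema",
    "view": "schema",
    "function": "schema",
}


def containment_path(kind: str) -> list[str]:
    """Walk from an object kind up to the account that ultimately contains it."""
    path = [kind]
    while path[-1] != "account":
        path.append(PARENT[path[-1]])  # KeyError for kinds not in the tree
    return path


print(" -> ".join(containment_path("table")))      # table -> schema -> database -> account
print(" -> ".join(containment_path("warehouse")))  # warehouse -> account
```
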
What is DDL?

I was a SQL analyst over a decade ago, so I'm dusting off some of my SQL programming principles.

Data Definition Language (DDL) is the subset of SQL commands used to define and manage database schema, e.g. CREATE, ALTER, DROP, TRUNCATE, RENAME.

The other SQL sublanguages are Data Manipulation Language (DML), e.g. SELECT, INSERT, UPDATE and DELETE; Data Control Language (DCL), e.g. GRANT and REVOKE; and Transaction Control Language (TCL), e.g. COMMIT and ROLLBACK.
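
To keep the categories straight, here is a minimal sketch that labels a statement by its leading keyword. The keyword sets follow the grouping above, with GRANT and REVOKE as the usual DCL examples. classify is a hypothetical helper: it only looks at the first word, so it will not handle comments or statements that start with WITH.

```python
# Keyword groups as described in the text; not an exhaustive list.
CATEGORIES = {
    "DDL": {"CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME"},
    "DML": {"SELECT", "INSERT", "UPDATE", "DELETE"},
    "DCL": {"GRANT", "REVOKE"},
    "TCL": {"COMMIT", "ROLLBACK"},
}


def classify(statement: str) -> str:
    """Return the SQL sublanguage for a statement, based on its first keyword."""
    stripped = statement.strip()
    if not stripped:
        return "UNKNOWN"
    keyword = stripped.split()[0].upper().rstrip(";")
    for category, keywords in CATEGORIES.items():
        if keyword in keywords:
            return category
    return "UNKNOWN"


for sql in ("create table t (id int)", "SELECT 1", "grant usage on schema s to role r", "COMMIT;"):
    print(f"{classify(sql)}: {sql}")
```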

What are Snowflake's UDFs?

User-defined functions (UDFs) extend functionality beyond the built-in, system-defined functions. Snowflake is a SQL-based service that extends its tooling to its other supported languages. The built-in functions largely follow those defined in the SQL:1999 and SQL:2003 standards.
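
As a rough sketch of what a Python UDF looks like, the snippet below builds a CREATE FUNCTION statement with an inline Python handler and checks the handler logic locally before anything is sent to Snowflake. The function name addone, the handler name addone_py, and the runtime version '3.10' are assumptions for illustration; check which Python runtimes your account supports.

```python
# Handler source that would run inside Snowflake. Names are illustrative.
HANDLER_SOURCE = '''
def addone_py(i):
    return i + 1
'''

# DDL that registers the handler as a SQL-callable function.
CREATE_UDF = f"""
CREATE OR REPLACE FUNCTION addone(i INT)
RETURNS INT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.10'
HANDLER = 'addone_py'
AS
$$
{HANDLER_SOURCE.strip()}
$$
"""

# Sanity-check the handler locally by executing the same source string.
namespace = {}
exec(HANDLER_SOURCE, namespace)
print(namespace["addone_py"](41))  # 42

print(CREATE_UDF)
# With a live Snowpark session (assumed, not created in this sketch):
# session.sql(CREATE_UDF).collect()
# session.sql("SELECT addone(41)").collect()
```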